ADA0855 


Approved for public release;
distribution unlimited.

DEPARTMENT 

of 

COMPUTER  SCIENCE 




Carnegie-Mellon University



AIR FORCE OFFICE OF SCIENTIFIC RESEARCH (AFSC)
NOTICE OF TRANSMITTAL TO DDC
Approved for public release IAW AFR 190-12 (7b).
Distribution is unlimited.

A. D. BLOSE

Technical Information Officer


ANALYSIS  OF  LANGUAGES 
FOR 

MAN-MACHINE  VOICE  COMMUNICATION 


A DISSERTATION 

SUBMITTED  TO  THE  COMPUTER  SCIENCE  DEPARTMENT 
AND  THE  COMMITTEE  ON  GRADUATE  STUDIES 
OF  STANFORD  UNIVERSITY 
IN  PARTIAL  FULFILLMENT  OF  THE  REQUIREMENTS 
FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 


by 

Robert  Gary  Goodman 
Carnegie-Mellon University
May 1976

Reprinted  September  1976 


DDC

FEB 14 1977


This  work  was  supported  by  the  Defense  Advanced  Research  Projects  Agency 
under contract F44620-73-C-0074 and monitored by the Air Force Office of
Scientific Research.


DISTRIBUTION STATEMENT A

Approved for public release;
Distribution Unlimited


ANALYSIS  OF  LANGUAGES 
FOR 

MAN-MACHINE  VOICE  COMMUNICATION 

Robert  Gary  Goodman,  Ph.D. 
Stanford University, 1976


Comparing  the  relative  performances  of  speech  understanding  systems  has 
always  been  difficult  and  subject  to  speculation.  Different  tasks  naturally  require 
different  vocabularies  with  varying  acoustic  similarities.  Moreover,  constraints  imposed 
by  the  syntax  may  make  recognition  easier,  even  for  vocabularies  with  high  ambiguity. 
This  thesis  presents  an  analysis  of  ambiguity,  restriction  and  complexity  in  speech 
understanding system languages. The ambiguity considered involves the similarity of
acoustic signals and the ambiguity it causes at other levels of recognition. Phonemes
spoken in isolation are misrecognized by both man and machine. Words and phrases
having  similar  phonetic  structure  are  confused.  This  confusion  increases  the 
complexity  with  connected  speech  but  syntactic  and  other  higher  levels  of  knowledge 
provide  additional  constraints  to  reduce  the  ambiguity.  This  thesis  examines  ambiguity 
and  complexity  at  the  phonetic,  lexical  and  syntactic  levels.  Ambiguity  may  also  occur 
at  the  semantic  and  user  discourse  levels.  The  concepts  presented  here  can  be 
extended  to  these  levels. 

Measures  are  developed  which  permit  the  relative  comparison  of  the  difficulties 
of  a given  set  of  recognition  tasks.  We  present  notions  of  equivalent  vocabulary  size, 
branching  factor,  effective  branching  factor,  search  space  size  and  search  space 
reduction.  All  of  these  are  useful  as  relative  comparison  measures.  Briefly,  the  plan 
of  research  is  to  investigate,  in  order:  phonetic  ambiguity,  word  ambiguity,  lexical 
ambiguity,  syntactic  constraint  and  the  combined  effects  of  lexical  ambiguity  and 
syntactic  constraint. 

First, the major source of ambiguity, the acoustic speech signal itself, is
considered. Several measures for quantifying phonetic ambiguity are investigated and
compared. These measures provide a basis for the computation of lexical and phrasal
ambiguity. 

A model for lexical ambiguity is presented which utilizes the knowledge of
phonetic ambiguity and a general representation of the vocabulary to estimate the
probability that an acoustic realization of some sequence of idealized phonemes will
result in incorrect recognition. The average expected number of words retrieved in a
syntactically unconstrained lexical search is computed from these probabilities. This
number is called the equivalent size of the vocabulary. The 10 digits, for instance,
have an equivalent size of 1.19 words, while the equivalent size of the spoken
alphabet ("a", "b", ..., "z") is 3.87.

The syntax of languages for speech understanding systems imposes restrictions
on the number of word pairs, triples, etc. which can occur in the language. These
limitations can dramatically reduce the total size of the search space. One of the
languages investigated has a 250 word vocabulary and an average sentence length of
8 words. Syntactic restrictions reduce the branching factor to 7.3. That is, on the
average, one must disambiguate among 7 words.

Equivalent  vocabulary  size  may  be  viewed  as  a branching  factor  in  the  case 
where there are no syntactic constraints. Thus, lexical ambiguity and syntactic
restriction  are  measured  in  the  same  terms.  This  unification  allows  combined  effects  of 
vocabulary  ambiguity  and  syntactic  complexity  to  also  be  viewed  as  a branching  factor. 
Two models for complexity of connected speech are defined. The first is a "best"
behavior model, which assumes that word boundaries are known and therefore that the
only confusions that may arise are when two (or more) phonetically similar words have
the same contexts.
The  effective  branching  factor  obtained  can  be  viewed  as  an  optimistic  representation 
of  the  expected  behavior  of  the  system.  A "worst"  case  model  is  also  discussed. 

The  important  contribution  of  this  thesis  is  that  it  provides  a way  to 
characterize  the  relative  difficulties  and  accomplishments  of  different  speech 
understanding  systems.  Vocabulary  size  is  not  a good  measure  of  lexical  complexity; 
some other measure of vocabulary size, normalized for relative ambiguity, would be
better. The number of production rules is not a useful measure of grammatical
complexity. In fact, quite the opposite may be true; more rules imply more constraint.
Some other measure, such as the average number of alternatives at each choice point,
would be better.




ACKNOWLEDGEMENTS 


It is a great pleasure to have this opportunity to express my appreciation
to Raj Reddy for his constant technical guidance and moral support. He has
been  a true  friend  and  an  excellent  adviser. 

I am  grateful  to  my  thesis  committee:  John  McCarthy,  Arthur  Samuel,  and 
Cordell  Green  for  their  helpful  comments  concerning  this  thesis.  In  addition,  the 
many  interesting  discussions  with  my  colleagues  Lee  Erman,  Fredrick  Hayes-Roth, 
Ellis  Cohen  and  John  McDermott  were  of  great  value  to  the  success  of  this 
research. 

I am  deeply  indebted  to  Mike  Kelly  for  his  personal  interest  in  the 
completion  of  this  work. 

This  dissertation  is  dedicated  to  my  wife,  Ann,  whose  help  and 
encouragement  have  been  invaluable. 

This work was begun at Stanford University and continued at Carnegie-Mellon
University. I am grateful to the members of both Computer Science
Departments  for  providing  assistance,  support  and  outstanding  research 
facilities.  This  work  was  also  supported  by  the  Advanced  Research  Projects 
Agency  of  the  Department  of  Defense. 

Special  thanks  go  to  Brian  Reid  for  his  valuable  and  patient  assistance 
during the preparation of this thesis and to Bruce Lowerre for his expert
programming  support. 


TABLE OF CONTENTS


I. INTRODUCTION

Ambiguity in Speech Understanding Systems
Previous Research
Outline of the Dissertation Presentation

II. PHONETIC AMBIGUITY

Phonetic Ambiguity Measures
Articulatory Model
Validation of the Model

III. LEXICAL AMBIGUITY

The Nature of Lexical Ambiguity
A Lexical Ambiguity Measure
Word Ambiguity Model
Interpretation of Results

IV. SYNTACTIC RESTRICTION IN SPEECH UNDERSTANDING TASKS

Measures of Grammatical Complexity
Dynamic Branching Factor - A Measure of Syntactic Restriction
Syntactic Search Space

V. COMPLEXITY IN CONNECTED SPEECH - A Restricted Model

Lexical Ambiguity and Syntactic Restriction
Ambiguity Analysis in the Restricted Model
Search Space Reduction

VI. COMPLEXITY IN CONNECTED SPEECH - A General Model

Ambiguity in Connected Speech
General Ambiguity Model
Discussion of Results

VII. RESULTS OF LANGUAGE ANALYSIS

Description of the Tasks
Discussion of Results

VIII. CONCLUSIONS

Contributions
Directions for Future Research
Implications of the Model for Language Design

REFERENCES

APPENDICES

A. Phonetic Ambiguity - Itakura Metric
B. Phonetic Ambiguity - Articulatory Model
C. Task Descriptions




LIST OF FIGURES


Figure 1-1. Some simple vocabularies with intuitive complexities
Figure 1-2. Illustrations of comparative recognition rates
Figure 2-1. Articulatory model - Features and allowed values
Figure 2-2. Flow chart for theoretical phonetic ambiguity model
Figure 3-1. An information channel and its channel matrix
Figure 3-2. Word network example, "A" and "Q"
Figure 3-3. Flow diagram for word-to-word probability calculation
Figure 3-4. Calculation of n(Sj/Sj)
Figure 3-5. Results of Lexical Ambiguity analysis
Figure 4-1. BNF definition for the example APEX
Figure 4-2. Example of a Grammar Network
Figure 4-3. Word sequences for the example APEX
Figure 4-4. Some simple measures of grammatical complexity
Figure 4-5. State probabilities for the example APEX
Figure 4-6. Branching factors for the tasks studied
Figure 4-7. Log of search space size
Figure 5-1. BNF definition for the Lizard task
Figure 5-2. An example of a sub-vocabulary for the Lizard task
Figure 5-3. Results of complexity analysis
Figure 5-4. Search space reduction ratios
Figure 6-1. Comparison of "best" and "worst" case branching factors
Figure 6-2. Comparison of "best" case analysis for words and syllables
Figure 7-1. Results of language analysis




1.  INTRODUCTION 

Comparing the relative performances of speech understanding systems has
always been difficult and subject to speculation. Different tasks naturally require
different vocabularies with varying acoustic similarities. Moreover, constraints imposed
by the syntax may make recognition easier, even for vocabularies with high ambiguity.
This thesis presents an analysis of ambiguity, restriction and complexity in speech
understanding system languages. The ambiguity considered involves the similarity of
acoustic signals and the ambiguity it causes at other levels of recognition. Phonemes
spoken in isolation are misrecognized by both man and machine. Words and phrases
having similar phonetic structure are confused. This confusion increases the
complexity with connected speech but syntactic and other higher levels of knowledge
provide additional constraints to reduce the ambiguity. In this thesis, we will examine
ambiguity and complexity at the phonetic, lexical and syntactic levels. Ambiguity may
also occur at the semantic and user discourse levels. We believe that the concepts
presented here can be extended to these levels in an analogous manner.

1.1.  Ambiguity  in  Speech  Understanding  Systems 

To illustrate some of the issues relating to complexity, consider the first two
vocabularies shown in figure 1-1. The first vocabulary is the spoken letters "B", "D"
and "V", while the second is comprised of the three digits "ONE", "TWO" and "THREE".
It would not be difficult to elicit opinions as to which of these two vocabularies would
be easier to recognize; and, most likely, there would be a consensus. In this case,
intuition gives the correct answer. Consider now vocabulary 3, which contains the
spoken letters "A", "B" and "C". Would vocabulary 2 be easier to recognize than
vocabulary 3? Again, opinions are easy to come by, but in this case there may not be
agreement.


Vocabulary 1:

    "B"      <B> <IY>
    "D"      <D> <IY>
    "V"      <V> <IY>

Vocabulary 2:

    "ONE"    <W> <AH> <N>
    "TWO"    <T> <IH> <UW>
    "THREE"  <TH> <ER> <IY>  or  <TH> <R> <IY>

Vocabulary 3:

    "A"      <EH> <IH>  or  <EH> <IX>
    "B"      <B> <IY>
    "C"      <S> <IY>

Figure 1-1. Some simple vocabularies with intuitive complexities


An example of the performance of an isolated word recognition system will
serve to illustrate that the number of words in a vocabulary may not be indicative of
its complexity. Itakura[1975], in his word recognition system, investigated two
vocabularies. The first vocabulary, called the alpha-digit vocabulary, contains the 26
letters of the English alphabet and the ten digits. The other is a vocabulary of 250
Japanese geographical names. The results, shown in figure 1-2, were that Itakura
achieved 88.6% recognition for the alpha-digit vocabulary and 97.3% with the
geographical names. Why is it that the alphabet and digits are more difficult to
recognize than the 250 names? One might guess that the names were multi-syllabic
and phonetically dissimilar. While Itakura did not list the names, it was stated that
there were 3.5 syllables per word, on the average. The questions raised here will be
answered in chapter 3.

To illustrate the effects of syntactic constraint, consider the results of Baker and
Bahl[1975], also shown in figure 1-2. The languages used were telephone numbers
and the "New Raleigh" language with vocabulary sizes of 10 words and 250 words,
respectively. The recognition rates for these two tasks are roughly the same even
though one has 25 times as many words as the other. The reason could be that
the 250 word vocabulary is unambiguous or it could be due to the constraint imposed
by the syntax. A more precise answer will appear in later chapters.

In this thesis we want to develop some measures which will permit the relative
comparison of the difficulties of a given set of recognition tasks. We will present
notions of equivalent vocabulary size, branching factor, effective branching factor,
search space size and search space reduction. All of these are useful as relative
comparison measures.




ISOLATED WORDS (Itakura, 1975)

Vocabulary                      Recognition   Rejection   Error
                                Rate (%)      Rate (%)    Rate (%)
Alpha-Digit                     88.6          0           11.4
Japanese Geographical Names     97.3          1.7         1.0

CONSTRAINED LANGUAGES (Baker and Bahl, 1975)

Language                                % Correct   % Correct
                                        Words       Sentences
Telephone Numbers (7 decimal digits)    97.4        89
"New Raleigh Language"                  98.3        81


Figure 1-2. Illustrations of comparative recognition rates.




1.2. Previous Research

Virtually every speech understanding system faces the problem of phonetic
ambiguity; thus, there are many metrics which attempt to measure the
similarity/difference of acoustic events. We have chosen the minimum prediction
residual metric[Itakura, 1975] for use in this thesis. This metric is a measure of the
distance or dissimilarity between segments of discrete-time signals.

There is considerable phoneme confusion data for human perception, both for
vowels[Peterson and Barney, 1952; Ladefoged and Broadbent, 1957] and for
consonants[Miller and Nicely, 1955]. The Miller and Nicely paper discusses some
theoretic concepts of information content of various distinguishable characteristics of
the perception.

There is no known previous work in lexical ambiguity, except that in the Speech
Understanding Report[Newell, et al., 1971, appendix 10].

Several  papers  from  the  field  of  programming  languages  and  formal  grammar 
theory  discuss  the  effects  of  context;  they  have  limited  applicability  to  speech 
recognition  systems,  however.  A summary  of  the  methods  is  found  in  "Translator 
Writing Systems"[Feldman and Gries, 1968].

1.3. Outline of the Dissertation Presentation

Briefly,  the  plan  of  research  is  to  investigate,  in  order:  phonetic  ambiguity,  word 
ambiguity,  lexical  ambiguity,  syntactic  constraint  and  the  combined  effects  of  lexical 
ambiguity and syntactic constraint. A short preview of each chapter follows.




In chapter 2 we consider the major source of ambiguity; i.e., the acoustic speech
signal itself. Several measures for quantifying phonetic ambiguity are investigated and
compared. These measures provide a basis for the computation of lexical and phrasal
ambiguity in succeeding chapters.

In chapter 3 we present a model for lexical ambiguity. The model utilizes the
knowledge of phone-to-phone confusions from chapter 2 and a general representation
of the vocabulary to estimate the probability that an acoustic realization of some
sequence of idealized phonemes will result in incorrect recognition. The average
expected number of words retrieved in a syntactically unconstrained lexical search is
computed from these probabilities. This number is called the equivalent size of the
vocabulary. The 10 digits have an equivalent size of 1.19 words, while the equivalent
size of the spoken alphabet ("a", "b", ..., "z") is 3.87. This shows that the phonetic
similarity of the alphabet is greater relative to the digits. This result is not surprising;
it is, in fact, what one would expect.

Chapter  4 discusses  the  effects  of  syntactic  restriction  without  regard  to  the 
lexical ambiguity. The syntax of languages for speech understanding systems imposes
restrictions  on  the  number  of  word  pairs,  triples,  etc.  which  can  occur  in  the  language. 
These  limitations  can  dramatically  reduce  the  total  size  of  the  search  space.  The  IBM 
"New  Raleigh"  language  has  a 250  word  vocabulary  and  an  average  sentence  length  of 
8 words.  Syntactic  restrictions  reduce  the  branching  factor  to  7.3.  That  is,  on  the 
average,  one  must  disambiguate  between  7 words.  The  voice  programming  language 
used  by  Lowerre  has  only  a 37  word  vocabulary  and  an  average  branching  factor  of 
10.8. Thus, a 37 word vocabulary may provide a more stringent test of a recognition
system  than  a 250  word  vocabulary.  This  would  depend  also,  of  course,  on  the 
ambiguity  of  the  words  themselves. 
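
The arithmetic behind this comparison can be sketched as follows. This is only an illustration under the assumption that the number of word sequences grows as the branching factor raised to the sentence length; the function name is ours, and the thesis's own treatment of search space size is given in chapter 4.

```python
import math


def log10_search_space(branching_factor: float, sentence_length: int) -> float:
    """Log (base 10) of the number of word sequences, assuming b ** n growth."""
    return sentence_length * math.log10(branching_factor)


# Figures quoted in the text for the IBM "New Raleigh" language.
unconstrained = log10_search_space(250, 8)  # any of 250 words at each position
constrained = log10_search_space(7.3, 8)    # average branching factor under the syntax

print(f"unconstrained: about 10^{unconstrained:.1f} sequences")
print(f"constrained:   about 10^{constrained:.1f} sequences")
print(f"reduction:     about {unconstrained - constrained:.1f} orders of magnitude")
```

Working in logarithms keeps the numbers manageable and makes the reduction due to syntax a simple difference.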




Chapter 5 examines the combined effects of vocabulary ambiguity and syntactic
complexity, while ignoring the juncture ambiguity that further complicates connected
speech (this can be thought of as a "best" behavior model or as a model for pause
separated speech). This model assumes that word boundaries are known and therefore
the only confusions that may arise are when two phonetically similar words have the same
contexts.  The  effective  branching  factor  obtained  can  be  viewed  as  an  optimistic 
representation  of  the  expected  behavior  of  the  system. 

The problems of connected speech are addressed in chapter 6. Given the "best"
behavior model for complexity of chapter 5, we examine the limitations of that model
with respect to the problems of connected speech. Then, a general model for
ambiguity analysis of connected speech is developed. This model measures the
ambiguity assuming that there is some uncertainty about the correctness of the
recognition. In a sense, this may be viewed as a "worst" case model. The effective
branching factor obtained is a pessimistic measure of the ambiguities which may arise.

Chapter  7 contains  the  analysis  of  four  vocabularies  and  several  languages  of 
interest.  The  vocabularies  are  a set  of  31  phones,  the  10  digits,  the  spoken  alphabet, 
and  the  alphabet  and  digits  combined.  The  languages  are  CHESS,  VP,  LIZARD,  IBM, 
LLBAS and LLEXT. CHESS is the original Hearsay-I chess task language. VP is a voice
programming language with 37 words and LIZARD is a small version of VP having 17
words. IBM is IBM's "New Raleigh" language of English-like sentences. LLBAS is
Lincoln  Lab’s  "basic"  language  for  displaying  and  controlling  acoustic  data.  It  has  a 
vocabulary  of  236  words.  And  LLEXT  is  an  "extended"  version  of  LLBAS  having  a 410 
word  vocabulary.  Appendix  C contains  descriptions  of  these  vocabularies  and  tasks. 


There are many ways of approaching the analysis of ambiguity in speech
understanding tasks. Each new idea spawns several new and interesting problems and
ideas. The methods we have used have been shown to be reliable relative estimators
of ambiguity, although no claim is made that they are unique or complete. This work
represents the best analytical tool we have to date for the design of languages for
man-machine communication. These issues are discussed as part of chapter 8 on
conclusions of this research.




2.  PHONETIC  AMBIGUITY 

The  major  source  of  ambiguity  in  speech  recognition  is  in  the  acoustic  signal 
itself.  Ambiguities  of  this  nature  must  be  dealt  with  at  all  levels  of  recognition.  This 
chapter  discusses  the  ambiguity  of  acoustic  events  and  investigates  several  measures 
for  quantifying  its  effects.  These  measures  provide  a basis  for  the  computation  of 
lexical  and  phrasal  ambiguity  in  succeeding  chapters. 

Vocal  production  is  accomplished  by  actions  of  the  articulatory  mechanism 
consisting  of  the  lungs,  vocal  chords,  tongue,  lips  and  throat,  mouth  and  nasal  cavities. 
While  the  articulators  can  assume  a wide  variety  of  positions,  only  a few  classes  are 
employed  by  any  one  language.  Each  separately  distinguishable  class  represents  the 
same  linguistic  unit,  called  a phoneme.  The  acoustic  realization  of  a phoneme  is  termed 
a phone.  These  realizations,  unfortunately,  do  not  fall  into  separable,  mutually 
exclusive  classes.  The  ambiguity  of  phones  is  well  documented  in  experiments  in  both 
human  perception  and  machine  recognition.  Some  confusion  exists  in  human  perception 
with high quality speech when the phones are presented in isolation[Miller and Nicely,
1955].  This  confusion  becomes  greater  when  the  signal  is  corrupted  by  noise. 
Ambiguity  in  machine  recognition  is  summarized  nicely  in  the  ARPA  Speech 
Understanding  Report[Newell,  1971].  This  report  also  discusses  ways  of  dealing  with 
ambiguity  in  speech  understanding  systems  and  provides  a good  general  reference  for 
the  subject. 

2.1.  Phonetic  Ambiguity  Measures 


Most speech recognition systems begin by segmenting some parametric
representation of the acoustic space followed by classification of the resulting
segments. Classification attempts to assign a phoneme-like label, or labels, to each
segment. This chapter is concerned with the measurement of the reliability of making
these classifications. In particular, we wish to determine the probability of phone p1
being recognized as phone p2 for all pairs. Although these probabilities are
mathematically well defined, they cannot be calculated: they must be measured. We
will discuss three ways of estimating these conditional probabilities: actual counts,
acoustic-parametric metrics and theoretical models.

We could obtain these probabilities from actual counts using some existing
recognition system, be it man or machine. This is usually done by comparing the
output  of  the  classifier  with  an  accurate  hand  segmentation  and  labelling.  The  result  is 
the  classical  confusion  matrix  giving  the  frequencies  of  correct  and  incorrect 
classifications.  Conditional  probabilities  can  then  be  derived  from  these  frequencies. 
This  method  suffers  from  the  fact  that  large  amounts  of  data  are  required  to  provide 
accurate  estimates  and  rare  confusions,  in  general,  are  not  accounted  for.  Also,  the 
statistics  could  easily  be  biased  by  the  particular  design  of  the  system  used  to  gather 
them. Careful selection of the data is necessary in order that all phones are
represented  in  their  typical  contexts.  In  human  perception  data,  contextual  cues  which 
could  provide  information  helpful  for  recognition  must  be  eliminated. 
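
As a minimal sketch of this first method, the conditional probabilities can be obtained by normalizing each row of the confusion matrix. The function name and the counts below are made up for illustration and are not data from any recognition system.

```python
def confusion_probabilities(counts):
    """Row-normalize a confusion-count table.

    counts[spoken][heard] is the number of times phone `spoken` was labelled
    as phone `heard`. Returns estimates of P(heard | spoken).
    """
    probs = {}
    for spoken, row in counts.items():
        total = sum(row.values())
        if total == 0:
            raise ValueError(f"no observations for phone {spoken!r}")
        probs[spoken] = {heard: n / total for heard, n in row.items()}
    return probs


# Hypothetical counts for three phones, for illustration only.
example_counts = {
    "B": {"B": 80, "D": 15, "V": 5},
    "D": {"B": 20, "D": 75, "V": 5},
    "V": {"B": 10, "D": 0, "V": 90},
}
example_probs = confusion_probabilities(example_counts)
print(example_probs["B"]["D"])
```

Note that the pair ("V", "D") was never observed in this sample and therefore receives probability zero, which is the weakness with rare confusions mentioned above.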

Another  method  of  obtaining  the  probabilities  would  be  by  direct  comparison  of 
parametric  representations  of  the  phones.  In  this  method,  a prototype  is  chosen  from 
the  set  of  realizations  for  each  phone.  Distances  between  phone  pairs  are  then  used 
to  estimate  the  conditional  probabilities.  This  method  is  also  dependent  upon  the 
original  data  and  the  choice  of  the  prototype.  It  does,  however,  consider  rare  events 
since  it  assigns  some  probability  to  every  possible  confusion. 
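
One way such a conversion from distances to probabilities might look is sketched below. The mapping is an assumption of ours (similarity taken as exp(-distance), rows normalized to sum to one) and the distances are hypothetical; the text above does not prescribe a particular mapping.

```python
import math


def probabilities_from_distances(distances):
    """Turn prototype-to-prototype distances into conditional probabilities.

    Assumption (ours, for illustration): similarity is exp(-distance), and
    each row is normalized so that it sums to one.
    """
    probs = {}
    for spoken, row in distances.items():
        weights = {heard: math.exp(-d) for heard, d in row.items()}
        total = sum(weights.values())
        probs[spoken] = {heard: w / total for heard, w in weights.items()}
    return probs


# Hypothetical distances between three phone prototypes.
example_distances = {
    "B": {"B": 0.0, "D": 1.0, "V": 2.5},
    "D": {"B": 1.0, "D": 0.0, "V": 2.0},
    "V": {"B": 2.5, "D": 2.0, "V": 0.0},
}
distance_probs = probabilities_from_distances(example_distances)
```

Because every finite distance maps to a positive weight, every confusion receives some probability, however small.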




All speech understanding systems must deal with the uncertainty of phone-phoneme
similarity. There are almost as many methods of doing this as there are
systems.  Clearly  then,  no  particular  method  stands  out  as  the  best.  The  choice  of 
which  method  to  use  for  estimating  phonetic  ambiguity  represents  a design  decision. 
Since  we  are  interested  in  a model  which  makes  relative  comparisons,  any  metric  which 
captures  the  essence  of  the  similarity  and  dissimilarity  of  the  phones  will  serve  the 
purpose.  Of  course,  the  closer  the  metric  models  the  true  probabilities,  the  more 
precise  the  outcome  of  the  model.  For  the  purposes  of  this  thesis,  we  have  chosen  the 
minimum  residual  metric  used  by  Itakura[1975].  Itakura’s  recognition  scheme  uses  this 
metric  along  with  a dynamic  programming  algorithm  for  temporal  matching  of  isolated 
words.  His  system  is  one  of  the  better  telephone  speech  recognition  systems.  The 
minimum  residual  metric  matches  spectral  characteristics  of  an  unknown  time  signal 
with  stored  reference  patterns.  Reference  patterns  are  essentially  linear  prediction 
models  of  the  phones.  The  result  of  the  matching  algorithm  is  the  log  of  the 
probability  that  the  unknown  is  a realization  of  the  stored  model.  Estimates  of  the 
phone  to  phone  conditional  probabilities  for  all  phone  pairs  are  obtained  by  treating 
the  reference  patterns  as  the  unknowns.  Appendix  A contains  a description  of  the 
algorithm,  a set  of  reference  patterns  for  the  phones  used  in  our  analysis,  and  the 
complete  phone  probability  matrix. 
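
For readers without appendix A at hand, the following sketch shows the general form of a prediction residual distance as we understand it: the log of the ratio between the residual left by a reference linear prediction model on the unknown segment and the smallest residual achievable on that segment. The function names and the autocorrelation values are ours and purely illustrative; this is not the implementation used to produce the matrix in appendix A. A smaller distance corresponds to a higher log probability that the unknown is a realization of the reference.

```python
import math


def levinson_durbin(r, order):
    """Linear prediction coefficients a (with a[0] == 1) from autocorrelation
    values r[0..order], together with the minimum residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def residual_energy(a, r):
    """Quadratic form a R a^T, where R is the Toeplitz matrix built from r."""
    n = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(n) for j in range(n))


def residual_distance(r_unknown, a_reference, order):
    """Log ratio of the reference model's residual on the unknown segment
    to the minimum residual for that segment (zero when they coincide)."""
    _, best = levinson_durbin(r_unknown, order)
    return math.log(residual_energy(a_reference, r_unknown) / best)


# Hypothetical autocorrelation sequences for two segments.
r_x = [1.0, 0.5, 0.1]
r_y = [1.0, -0.3, 0.2]
a_x, _ = levinson_durbin(r_x, 2)
a_y, _ = levinson_durbin(r_y, 2)
print(residual_distance(r_x, a_y, 2))
```

Treating each reference pattern in turn as the unknown, as described above, fills in one row of a phone-to-phone distance table.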

Another  method  for  obtaining  these  probabilities  would  be  through  the  use  of  a 
theoretical  model.  A long  term  goal  is  to  develop  an  articulatory  position  model  for 
estimating  confusion  probabilities.  In  the  next  section,  we  present  such  a model.  At 
the  present  time,  the  model  is  not  accurate  enough  to  be  used  and  represents  an  area 
for future research.




2.2. Articulatory Model

An  articulatory  feature  model  was  chosen  as  the  basis  for  arriving  at  a 
theoretical  quantitative  measure  for  phonetic  ambiguity.  Articulator  positions  are 
easily  understood  and  represent  a natural  way  of  discussing  phonetic  phenomena.  The 
model  may  be  divided  into  five  phases:  selection  of  the  features  used,  definition  of 
phones  in  terms  of  these  features,  computation  of  distances  in  the  feature  space, 
inversion  of  distances  to  obtain  log  probabilities  and  normalization.  We  will  discuss 
each  of  these  in  order. 

The articulatory features used are listed in Figure 2-1 in decreasing order of
influence. The set of allowed values is given for each feature.


Vocal Tract Closure
    O- open
    C- closed or constricted
    T- turbulent

Vocal Chords
    V- vibrating (voiced)
    U- not vibrating (unvoiced)

Nasal Cavity
    O- open
    C- closed

Tongue Position
    B- back
    C- central
    F- front

Tongue Height
    L- low
    M- medial
    H- high

Tongue Tip
    M- moving
    N- not moving

Lips
    N- normal
    C- closed
    R- rounded

Figure 2-1. Articulatory Model - Features and Allowed Values.


Having decided on the features to be used, each phone was then defined in
terms of these features in a fairly natural way. For instance, the throat is open for all
vowels, turbulent for fricatives and constricted for the other consonants. A complete
list of the definitions of the phones in terms of their feature values is given in
appendix B.

The next step is to quantify the difference of phones based upon their feature
descriptions. This part of the model assumes that the contributions of the articulators
are essentially independent. Studies in co-articulation have shown that the movements
of the articulators are not independent; and, later we will find that our model does
incorporate one co-articulatory aspect. But, while co-articulation occurs often, its
effects are minor. Thus, it was felt that the independency assumption retains sufficient
information for our purposes. Furthermore, while these secondary effects may alter
absolute judgements, their effects will be partially nullified when making relative
judgements using the same model.

The nature of the articulators and their features is such that the first two are
very strong indicators of difference while the others are valid only when vocal tract
closure characteristics and vocal chord vibration are the same. The decision part of
the method is beginning to emerge: if the first two features of the phones are the same,
compute the ambiguity based upon the other features; otherwise, base the computation
on the first two features alone. This decision process neglects one important
consideration. When the velum, or soft palate, is opened, the combined nasal and
mouth cavity presents a significantly different impedance for the driving function
produced  by  the  vocal  chords.  This  co-articulation  effect  was  incorporated  into  the 
model  by  splitting  the  voiced  feature  for  the  vocal  chords  into  V for  voiced  and  non- 
nasalized  and  N for  voiced  and  nasalized.  However,  for  purposes  of  the  decision 
process  described  above,  V and  N are  considered  equal.  The  consequence  of  this 
modification  will  become  clearer  in  the  discussion  of  influence  coefficients  in  the  next 
few  paragraphs. 

Using our assumption of independency, each articulator may be assigned
"influence coefficients" independently. These coefficients quantify the differences in
the feature values. There will be one coefficient for each difference of feature values.
Thus, each articulator will have either one or three coefficients depending upon
whether it has two or three feature values. For example, one coefficient for vocal
tract closure will be C(o,c)=C(c,o), representing the influence of the difference between
the throat being open and the throat being constricted. Other coefficients for closure
would be C(o,t)=C(t,o) and C(c,t)=C(t,c), representing the other possible ways closure
may differ. This gives a total of 17 coefficients. They are also given in appendix B.
These coefficients were arrived at in an ad hoc manner by picking some starting
values and modifying them until the response of the model seemed reasonable.

The  complete  flow  chart  for  the  computation  is  shown  in  Figure  2-2.  The  last 
box  is  a transformation  from  the  distances  computed  into  a space  of  log  probabilities 
ranging  from  0 to  -2.0. 

2.3. Validation of the Model

To test the soundness of the theoretical phonetic ambiguity model, the log
probabilities from the theoretical model were correlated with probabilities derived
from the Itakura metric. The results of this correlation are not at all encouraging. The
model is not sufficiently accurate to be of use at the present time. We hope to improve
it over the next few years.

Given the insufficiency of current theoretical models and the problems
associated with perceptual data, it appears that the most convenient and accurate
estimators of phonetic ambiguity are the acoustic-parametric metrics.




3.  LEXICAL  AMBIGUITY 

In this chapter we present a model for lexical ambiguity. The model utilizes the
knowledge of phone-to-phone confusions from chapter 2 and a general representation
of the vocabulary to estimate the probability that an acoustic realization of some
sequence of idealized phonemes will result in incorrect recognition. The average
expected number of words retrieved in a syntactically unconstrained lexical search is
computed from these probabilities.

3.1. The Nature of Lexical Ambiguity

Lexical ambiguity occurs when some word of the vocabulary (lexicon) is confused
with another word because the two are phonetically similar. Thus, "six" and "sticks",
being phonetically similar, could cause a lexical ambiguity if both exist in the same
lexicon. Syntax may be useful in resolving this ambiguity. Syntactic restrictions will
be covered in later chapters. This chapter will discuss the combinatorial explosion
expected in pure bottom-up approaches as a result of lexical ambiguity.

How can two vocabularies with differing phonetic similarities be compared?
Intuition may be reasonable for small vocabularies. Consider, for example, the two
vocabularies:

    V1: "a", "b" & "c"

and V2: "zero", "nine" & "seventeen"

But is intuition good for larger vocabularies? Does intuition help in comparing a
vocabulary of the 10 decimal digits and the 26 letters of the English alphabet with a
vocabulary of 250 Japanese place names? These two vocabularies have been
recognized by the same system [Itakura, 1975], so we have some basis for
comparison. In this case the alphabet and digits were recognized with 88.6% accuracy
and the place names with 97.3% accuracy.

The  problem  is  to  find  a measure  of  the  complexity  of  a vocabulary  so  that  two 
may  be  compared.  Briefly,  the  approach  is  to  view  the  recognition  process  as  a noisy 
channel  and  compute  the  information  loss  of  the  system.  Information  lost  is  a natural 
measure  of  the  ambiguity,  or  complexity,  of  the  system. 

3.2.  A Lexical  Ambiguity  Measure 

Figure 3-1 shows the block diagram of an information channel. There are r
possible input symbols which may be chosen from alphabet A and s possible output
symbols from the alphabet B. A channel is completely described by its channel matrix.
This matrix consists of the set of conditional probabilities P_ij = P(b_j/a_i) for all i and j,
where P_ij is the probability that output symbol b_j is recognized when the input symbol
a_i was spoken. In the context of word recognition, r = s, the input symbol represents
the word spoken and the output symbol is the word recognized. An example of a
channel matrix for the first three spoken letters of the alphabet is shown below.


There are several important relationships among these probabilities. If some
word a_i is spoken, then there is always some output. Thus,

    $$ \sum_{j} P(b_j / a_i) = 1, \qquad i = 1, 2, \ldots, r $$

Let the input symbols be chosen according to the probabilities P(a_1), P(a_2), . . .
P(a_r). These are referred to as the a priori probabilities of the input symbols. Then
the output symbols will appear according to some set of probabilities P(b_1), P(b_2), . . .
P(b_r). The dependency between these two distributions is given by

    $$ P(b_j) = \sum_{i} P(a_i) \, P(b_j / a_i) $$

The probabilities P(b_j/a_i) used to describe a channel are called the forward
probabilities. The backward probabilities P(a_i/b_j) may be derived using Bayes' Law as

    $$ P(a_i / b_j) = \frac{P(a_i, b_j)}{P(b_j)} = \frac{P(b_j / a_i) \, P(a_i)}{P(b_j)} $$

where P(a_i,b_j) is the probability of the joint event (a_i,b_j). These P(a_i/b_j) are
also called the a posteriori conditional probabilities of the input symbols.
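
The two relationships above can be sketched in a few lines of Python. This is only an illustration: the three-word channel matrix and the uniform a priori probabilities are hypothetical values, not data from this thesis, and the function names are invented for the example.

```python
def output_probs(prior, channel):
    """P(b_j) = sum over i of P(a_i) * P(b_j/a_i)."""
    n_out = len(channel[0])
    return [sum(prior[i] * channel[i][j] for i in range(len(prior)))
            for j in range(n_out)]


def backward_probs(prior, channel):
    """P(a_i/b_j) = P(b_j/a_i) * P(a_i) / P(b_j), indexed as [i][j]."""
    p_b = output_probs(prior, channel)
    return [[channel[i][j] * prior[i] / p_b[j] if p_b[j] > 0 else 0.0
             for j in range(len(p_b))]
            for i in range(len(prior))]


# Hypothetical forward probabilities: row i is the word spoken,
# column j is the word recognized. Each row sums to one.
prior = [1 / 3, 1 / 3, 1 / 3]
channel = [[0.8, 0.1, 0.1],
           [0.2, 0.7, 0.1],
           [0.1, 0.2, 0.7]]

p_b = output_probs(prior, channel)
back = backward_probs(prior, channel)
```

For every recognized word b_j, the backward probabilities over all spoken words a_i sum to one, which is a quick check on any estimated channel matrix.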

We will next discuss information quantities relating to the channel model. The
information received when a_i is spoken and b_j is recognized is [Goldman, 1953]

    $$ I(a_i; b_j) = \log \frac{\text{a posteriori probability that } a_i \text{ was spoken given that } b_j \text{ was recognized}}{\text{a priori probability that } a_i \text{ was spoken}} = \log\left[ P(a_i / b_j) / P(a_i) \right] $$

The base of the log function is arbitrary and defines the information units.
A base of 2 will be used throughout this thesis. Thus, information is measured in
bits. If the channel is perfect, then P(a_i/b_i) = 1 for all i, and the information per
message is

    $$ H(a_i) = -\log\left[ P(a_i) \right] $$

The average information per message is the average of I(a_i;b_j) over all events
(a_i,b_j):

    $$ H(A) = -\sum_{A,B} P(a_i, b_j) \log\left[ P(a_i) \right] = -\sum_{A} P(a_i) \log\left[ P(a_i) \right] $$

This quantity is the average information transmitted. It is also called the a priori
uncertainty of the input alphabet. Note that it depends only on the a priori
probabilities. If each input symbol is equally probable, then P(a_i) = 1/r and

    $$ H(A) = \log r \ \text{bits/symbol} $$

H(A) is the average number of bits necessary to specify a symbol of the
alphabet.


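The average information received at the output of an imperfect channel is

    $$
    \begin{aligned}
    I(A;B) &= \sum_{A,B} P(a_i, b_j) \, I(a_i; b_j) \\
           &= \sum_{A,B} P(a_i, b_j) \log\left[ P(a_i / b_j) / P(a_i) \right] \\
           &= -\sum_{A,B} P(a_i, b_j) \log\left[ P(a_i) \right] + \sum_{A,B} P(a_i, b_j) \log\left[ P(a_i / b_j) \right] \\
           &= -\sum_{A} P(a_i) \log\left[ P(a_i) \right] + \sum_{A,B} P(a_i, b_j) \log\left[ P(a_i / b_j) \right] \\
           &= H(A) + \sum_{A,B} P(a_i, b_j) \log\left[ P(a_i / b_j) \right]
    \end{aligned}
    $$

Rewriting,

    $$ H(A) - I(A;B) = -\sum_{A,B} P(a_i, b_j) \log\left[ P(a_i / b_j) \right] $$

Written this way, we see that the right hand side is equal to the information
transmitted minus the information received. This quantity, called the equivocation and
denoted H(A/B), represents the information lost in the channel.

    $$
    \begin{aligned}
    H(A/B) &= -\sum_{A,B} P(a_i / b_j) \, P(b_j) \log\left[ P(a_i / b_j) \right] \\
           &= -\sum_{B} P(b_j) \sum_{A} P(a_i / b_j) \log\left[ P(a_i / b_j) \right]
    \end{aligned}
    \qquad (3\text{-}1)
    $$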




H(A/B) is the average number of bits necessary to specify an input symbol after
examining the output. Recalling that 2^H(A) measures the actual size of the vocabulary,
consider 2^H(A/B). This quantity, which we call the equivalent vocabulary size, or EVS,
is a measure of the size of the vocabulary given the loss due to ambiguity in the
vocabulary.

For perfect recognition, H(A/B) = 0 and the EVS is 1 word. This occurs when
P_ii = 1 and P_ij = 0 for i ≠ j; stated another way, when every word is phonetically
unambiguous. At the opposite extreme, every word is phonetically identical to every
other word. Then H(A/B) = H(A) and the information received is 0. In this case, H(A)
bits are required to represent an input symbol after examining the output. If each
symbol is equally probable the interpretation is that the best one could do would be to
make a guess from among the r possible words.
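
The two extremes just described can be checked numerically. The following is a minimal sketch of equation 3-1 and the equivalent vocabulary size, assuming the channel matrix is given as a list of rows P(b_j/a_i) and the a priori probabilities as a list; the function names and the test matrices are illustrative, not taken from the thesis.

```python
import math


def equivocation(prior, channel):
    """H(A/B) in bits, computed from P(a_i) and the forward matrix P(b_j/a_i)."""
    n_in, n_out = len(prior), len(channel[0])
    p_b = [sum(prior[i] * channel[i][j] for i in range(n_in))
           for j in range(n_out)]
    h = 0.0
    for j in range(n_out):
        if p_b[j] == 0.0:
            continue
        for i in range(n_in):
            joint = prior[i] * channel[i][j]      # P(a_i, b_j)
            if joint > 0.0:
                # P(a_i/b_j) = P(a_i, b_j) / P(b_j)
                h -= joint * math.log2(joint / p_b[j])
    return h


def equivalent_vocabulary_size(prior, channel):
    """EVS = 2 ** H(A/B)."""
    return 2.0 ** equivocation(prior, channel)


uniform_prior = [1 / 3] * 3
perfect = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
identical = [[1 / 3] * 3 for _ in range(3)]

evs_perfect = equivalent_vocabulary_size(uniform_prior, perfect)
evs_identical = equivalent_vocabulary_size(uniform_prior, identical)
```

With three equally likely words, the perfect channel gives an EVS of 1 word and the completely ambiguous channel gives an EVS of 3 words, matching the interpretation in the text.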

Only the probabilities P(a_i/b_j) and P(b_j) are required to calculate the EVS of a
vocabulary. We will now discuss how these may be obtained.

3.3.  Word  Ambiguity  Model 

The  natural  method  for  obtaining  the  conditional  probabilities  would  be  to  take 
actual  counts  using  some  existing  system.  The  same  problems  exist  here  as  for  the 
phone-phone  probabilities  in  chapter  2.  To  repeat,  they  require  large  amounts  of 
carefully  selected  data  for  accurate  estimates  and  the  data  will  be  biased  by  the 
idiosyncrasies  of  the  system  used  to  gather  the  data.  There  are  methods  of  obtaining 
this data which are more feasible. We have investigated three methods.




    M1: matching network representations against other
        network representations using Itakura's metric for
        log probabilities.

    M2: matching network representations against acoustic
        realizations using Harpy with the Itakura
        metric [Lowerre, 1976].

    M3: matching acoustic realizations against acoustic
        realizations using Itakura's recognition
        scheme [Itakura, 1975].

The last two methods are recognition systems which result in a set of conditional
probabilities P(a_i/s_j) for words a_i given acoustic signal s_j.

The first method requires a model for matching network representations. The
model chosen is general and is as independent of any particular recognition scheme as
possible. It performs worst case analysis in that it finds the match which maximizes
the probability of confusion. The next section discusses this model in more detail.

The phonetic definition of each word in the vocabulary is embodied in a finite
state recognition network similar to the networks used by HARPY [Lowerre, 1976]. An
example of a recognition network for "A" and "0" is shown in Figure 3-2. The network
contains an initial state S_0 and a final state S_f. Every allowed variation of a word of
the vocabulary is represented by a subnetwork starting at S_0 and ending at S_f. Each
subnetwork is buffered at the beginning and end by an optional silence phone so
that initial and final stops and fricatives have a context which they may match. Each
state of the network contains the phone label representing that state. Let this be
called PHNOF(S); for instance, PHNOF(S_2) = EH. In addition, there is a word associated
with every state. This has been omitted for clarity. Let this correspondence be given by
WORDOF(State). Thus, WORDOF(S_2) = "A". For each state, the set of immediately
previous states is denoted PREV(S). For example, PREV(S_2) = {S_0, S_1}. The set f(W) is
all final states of word W:

    $$ f(W) = \{\, s \mid s \in PREV(S_f) \text{ and } WORDOF(s) = W \,\} $$

For  word  "A",  i("A")  = {S3,S/j,Sg}. 

From such a word network and the phone-phone probabilities, word-to-word
confusion probabilities are calculated. This is done by first computing state-to-state
confusion probabilities P(S_j/S_i). Then, word-to-word probabilities are extracted using

    $$ P(W_j / W_i) = \max_{S_j \in f(W_j),\; S_i \in f(W_i)} P(S_j / S_i) $$

Since these relative probabilities are maximized, they do not in general sum to
one. They must be normalized so that

    $$ \sum_{j=1}^{r} P(W_j / W_i) = 1 $$

The effective vocabulary size defined in the previous section is then computed
from this matrix using equation 3-1. This brief description serves as a guide
to  the  discussion  of  the  next  section. 
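
The extraction and normalization step can be sketched as follows. This is a hedged illustration only: the state names, the log-probability values, and the use of base-2 logs are assumptions made for the example, and `word_confusions` is a hypothetical helper, not a routine from the thesis.

```python
def word_confusions(state_logp, finals):
    """Build a row-normalized word confusion matrix.

    state_logp[(sj, si)] is the log2 probability of ending in state sj
    when the correct state is si; finals[w] lists the final states f(w).
    The result is indexed as matrix[w_i][w_j] = P(W_j/W_i).
    """
    words = sorted(finals)
    matrix = {}
    for wi in words:
        raw = {}
        for wj in words:
            # Worst case analysis: keep the most confusable pair of final states.
            best = max(state_logp[(sj, si)]
                       for sj in finals[wj] for si in finals[wi])
            raw[wj] = 2.0 ** best
        total = sum(raw.values())
        matrix[wi] = {wj: p / total for wj, p in raw.items()}
    return matrix


# Hypothetical two-word network: "A" has two final states, "B" has one.
finals = {"A": ["a1", "a2"], "B": ["b1"]}
state_logp = {
    ("a1", "a1"): 0.0, ("a2", "a2"): 0.0, ("b1", "b1"): 0.0,
    ("a1", "a2"): -0.5, ("a2", "a1"): -0.5,
    ("b1", "a1"): -1.5, ("b1", "a2"): -2.0,
    ("a1", "b1"): -1.5, ("a2", "b1"): -2.0,
}
confusions = word_confusions(state_logp, finals)
```

Each row of `confusions` sums to one, so it can be used directly as the channel matrix of section 3.2.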

The  flow  diagram  for  the  computation  of  state  confusion  probabilities  is  shown  in 
figure  3-3.  In  this  algorithm,  all  probabilities  have  been  replaced  with  their  logs  so 
that  multiplications  become  additions.  For  each  word  W,  the  probabilities  P(Wj/W)  are 
found.  Given  a word  W,  a partial  order  exists  for  the  states  in  its  subnetwork.  For 
example,  the  partial  order  for  "A"  is  (SQ,Sj,S2,S3iS5,S^,Sf).  This  partial  order 
determines the order in which the calculations proceed. First, P(S_0/S_0) is set to
log(1) = 0. This may be interpreted as: the probability of being in state S_0 given that
you should be in state S_0 is 1. The computation then proceeds using the recursive
formula:

    $$ P_k(S_j / S_k) = \max_{Q' \in \{PREV(S_j) \cup S_j\},\; Q \in PREV(S_k)} \Big[ P_{k-1}(Q' / Q) + PHNPRB[PHNOF(Q), PHNOF(Q')] \Big] $$
The subscripts on P are redundant, but serve to emphasize that the probabilities
on each side of the equation are separate quantities. Figure 3-4 helps to interpret
this equation as follows: The first term on the right represents the maximum of the
probabilities of being in previous states of S_j given that the correct state should have
been some previous state of S_k. Added to this is the (log) probability of
misrecognizing the acoustics as PHNOF(Q') given that PHNOF(Q) was spoken. The result
is the probability of being in state S_j given that the correct recognition would lead to
state S_k. Allowing Q' to be S_j serves two purposes. First, sequences of phones may
match a single phone. For example, consonantal clusters may match a single consonant.
Or, as in the example shown, the diphthong <EH IH> may match the vowel <IY>. Secondly,
it may happen that the best match occurs before S_k = S_f. This would be true when W
ended with a stop consonant which matched the optional silence of another word.
Since  from  then  on.  PHNPRB[PHNOF(),PHNOF()]=0,  the  self  cycling  nature  of  the 
definition  will  retain  the  maximum  match  until 

3.4.  Interpretation  of  Results 

Figure  3-5  lists  the  results  of  lexical  analysis  for  several  vocabularies.  Recall 
that  an  equivalent  vocabulary  size  of  1 indicates  no  information  is  lost  in  the  channel 
and  thus  recognition  is  perfect.  The  first  three  vocabularies,  ABC,  BDV  and  V123  are 
the  vocabularies  introduced  as  intuitive  exercises  in  chapter  1.  We  see  that  BDV  is 


Figure 3-4. Calculation of P(S_j/S_k).


                    Number of Words     Equivalent
Task                in Vocabulary       Vocabulary Size

ABC                        3                 1.19
BDV                        3                 1.99
V123                       3                 1.03
PHONES                    33                20.10
DIGITS                    10                 1.19
ALPHABET                  26                 3.87
ALPHA-DIG                 36                 9.41
CHESS                     25                 1.46
Lincoln Labs
    Basic                236                 2.43
    Extended             410                 3.54
IBM                      250                 2.31
LIZARD                    17                 1.55
VP                        37                 1.70

Figure 3-5. Results of Lexical Ambiguity Analysis.




obviously the most ambiguous of the three. The fact that ABC and V123 are so close
in difficulty may be a slight surprise. The phones are highly ambiguous, as expected.
The 10 digits have an equivalent vocabulary size of 1.19 words while the equivalent
size of the spoken alphabet is 3.87 words.

Consider now the other vocabularies. Their real sizes range from 17 to 410
words while the effective sizes range between 1.46 and 3.54. They seem to be
directly related. This relative order is expected since large vocabularies have greater
potential for ambiguity and therefore, in general, have larger effective sizes. An
interesting comparison can be made between the chess vocabulary and the Lizard
vocabulary. In this case, the 17 words of the Lizard vocabulary have slightly higher
ambiguity than the 25 words of the chess task.




4.  SYNTACTIC  RESTRICTION  IN  SPEECH  UNDERSTANDING  TASKS 

The syntax of languages for speech understanding systems imposes restrictions
on the number of word pairs, triples, etc. which can occur in the language. These
limitations can dramatically reduce the total size of the search space. This chapter
discusses the effects of syntactic restriction without regard to the similarity of the
words involved. The combined effects of vocabulary and syntax are examined in
chapters 5 and 6.

4.1. Measures of Grammatical Complexity

Some measures of grammar size which have been used are the number of
non-terminals and the number of productions (right hand sides) in the grammar. There
are, in general, many ways to define a particular language. Thus, these are only very
gross measures in that they represent the complexity of the representation of the
grammar as opposed to the complexity of the grammar or syntax itself. Better
measures for quantifying complexity are the number of pairs of words that may occur
together and the number of word triples that may occur in the language. Pairs and triples
give some idea how syntax restricts the search space, but fall short in two aspects.
First, they account for local context only; that is, they consider at most the preceding
and following words. Secondly, they say nothing about the probabilities with which
they occur. In this chapter, average branching factor will be discussed as a measure
of syntactic restriction. Average branching factor (ABF) is defined as the expected
number of words which may occur next in an utterance. Two methods of averaging
will be presented, resulting in two types of ABF. Static average branching factor is
the result of averaging uniformly over all possible states of recognition. Dynamic
branching factor is computed similarly, but includes the probabilities of being in the
states. Thus, states which are rarely visited do not contribute as much as those which
occur often, such as those that occur in every sentence. While computing the average
branching factor, maximum and minimum branching factors are also found. For
completeness and comparison, all quantities mentioned above are tabulated for the
languages investigated. Fundamental to the computations is the method of
representing the syntax.

The initial representation for a grammar is its Backus Normal Form or
Backus-Naur Form (BNF) definition. An example of a BNF is shown in Figure 4-1 for the very
simple task called APEX. This example will be used to illustrate the concepts
presented in this chapter. This task is not typical (see appendix C), but is purposefully
small so that important ideas may be presented clearly. This BNF is transformed into a
probabilistic grammar network. Recognition networks of this form have been studied
by several investigators [Fu 1969, Woods 1970, Baker 1975, Lowerre 1976].
Recreating previous work at this level was deemed unnecessary and unjustified. Thus,
we chose to utilize the network representation used by the Dragon speech recognition
system [Baker, 1975] and later modified for use by the HARPY system [Lowerre, 1976].
The network for APEX is shown in figure 4-2. In this figure, each box represents a
state of partial recognition and is labelled with a state number. There is a special
state called the initial state, denoted here by S_0. Every other state contains a word
from the vocabulary. The successors for each state are indicated by arrows in the
figure. The set of successors for state s is denoted by NEXT(s). If NEXT(s) is
empty, the state s is called a final state. In similar fashion, the set of predecessors is
represented by the function PREV(s). By "being in a state" we mean that some partial
recognition has led to the state after recognition of the word of the state. Loops are




<QUERY> ::=
        [ <REQUEST> ]

<REQUEST> ::=
        HELLO
        GIVE <GIVE>

<GIVE> ::=
        MORE
        EVERYTHING
        ME <NOUN-PHRASE>

<NOUN-PHRASE> ::=
        EVERYTHING
        THE <NOUN>

<NOUN> ::=
        NEWS
        SUMMARY
        STORIES

Figure 4-1. BNF definition for the example APEX.


Figure  4-2.  Example  of  a Grammar  Network. 




possible in this representation, although they do not occur in the specific example. It
is clear that any regular (finite state) language can be represented by such a network.
Since most speech recognition tasks have been defined in terms of regular languages,
this does not represent a severe restriction. Furthermore, languages for speech
recognition are designed to describe a large but finite number of sentences; it is an
artifact that they also define sentences of infinite length. Thus, one could, in general,
redefine the language by describing all the sentences of interest using finite-state
grammars. In fact, very simple transformations allow this to be done. The Lizard
task [appendix C], for example, contains phrases which could be defined by

    <SIMPLE-EXPRE> ::= <PRIMARY> <BIN-OPE> <PRIMARY>

By defining the non-terminals <PRIMARYCE> and <PRIMARYDE>, each with identical but
separate definitions, this may be rewritten as

    <SIMPLE-EXPRE> ::= <PRIMARYCE> <BIN-OPE> <PRIMARYDE>

This transformation has preserved context by duplicating a non-terminal which
occurred in different contexts. Each of the tasks investigated was found capable of
being made regular by this method.

Given the BNF and a grammar network, the simple measures can be found quickly.
The number of non-terminals and productions is determined by counting these
quantities directly from the BNF. All possible word pairs (and triples) may be obtained
by considering each state in the network and its possible successors (and
predecessors). A complete list of the word sequences for the example APEX is given
in figure 4-3. These four simple measures are summarized for all the tasks in figure
4-4. While these quantities are useful, a more revealing quantity is the average
branching factor.
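
The pair and triple counts for APEX can be reproduced with a short sketch. Note the assumptions: instead of walking the grammar network state by state, this version simply enumerates every sentence of the (non-recursive) APEX grammar and collects adjacent word sequences, and it writes the sentence boundary as "#". The names `GRAMMAR`, `expand` and `ngrams` are invented for the illustration.

```python
# APEX grammar: each non-terminal maps to its alternative right hand sides.
GRAMMAR = {
    "<REQUEST>": [["HELLO"], ["GIVE", "<GIVE>"]],
    "<GIVE>": [["MORE"], ["EVERYTHING"], ["ME", "<NOUN-PHRASE>"]],
    "<NOUN-PHRASE>": [["EVERYTHING"], ["THE", "<NOUN>"]],
    "<NOUN>": [["NEWS"], ["SUMMARY"], ["STORIES"]],
}


def expand(symbols):
    """Yield every word sequence derivable from a list of symbols.

    Only suitable for grammars without recursion, such as APEX.
    """
    if not symbols:
        yield []
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        for production in GRAMMAR[head]:
            yield from expand(production + rest)
    else:
        for tail in expand(rest):
            yield [head] + tail


def ngrams(sentences, n):
    """Collect the distinct n-word sequences, including boundary markers."""
    found = set()
    for words in sentences:
        padded = ["#"] + words + ["#"]
        for k in range(len(padded) - n + 1):
            found.add(tuple(padded[k:k + n]))
    return found


sentences = list(expand(["<REQUEST>"]))
pairs = ngrams(sentences, 2)
triples = ngrams(sentences, 3)
print(len(sentences), len(pairs), len(triples))  # 7 16 15
```

The seven APEX sentences yield 16 distinct pairs and 15 distinct triples, in agreement with the counts reported for the example.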




Number of Word Pairs: 16

    GIVE         MORE
    GIVE         EVERYTHING
    GIVE         ME
    ME           EVERYTHING
    ME           THE
    THE          NEWS
    THE          SUMMARY
    THE          STORIES
    HELLO        #
    MORE         #
    EVERYTHING   #
    NEWS         #
    SUMMARY      #
    STORIES      #
    #            GIVE
    #            HELLO

Number of Word Triples: 15

    GIVE         ME           EVERYTHING
    GIVE         ME           THE
    ME           THE          NEWS
    ME           THE          SUMMARY
    ME           THE          STORIES
    GIVE         MORE         #
    GIVE         EVERYTHING   #
    ME           EVERYTHING   #
    THE          NEWS         #
    THE          SUMMARY      #
    THE          STORIES      #
    #            GIVE         EVERYTHING
    #            GIVE         ME
    #            GIVE         MORE
    #            HELLO        #

Figure 4-3. Word Sequences for the Example APEX.




TASK      NNT    NPS     PAIRS    P/WORD    TRIPLES     T/WORD

CHESS      33     84       207      7.96       2362      94.48
LIZ         8     34       182     10.71       1866     109.76
VP         41    181       622     16.81     10,152     274.37
IBM        38    314      2304      9.22     22,004      88.02
LLBAS     127    391      2617     11.09     47,219     200.08
LLEXT     163    679    10,28B     25.03    566,633    1382.03

NNT is the number of non-terminals.
NPS is the number of productions.
PAIRS is the number of word pairs.
P/WORD is the number of word pairs/word.
TRIPLES is the number of word triples.
T/WORD is the number of word triples/word.

Figure 4-4. Some simple measures of grammatical complexity.




Two methods of averaging are defined yielding a static average branching factor
(SABF) and a dynamic average branching factor (DABF). Let BR(s) be the local
branching factor for the state s. BR(s) is the number of states in NEXT(s). SABF is
BR(s) averaged over all non-final states. Define

    $$ NFS = \{\, s \mid NEXT(s) \text{ is not empty} \,\} $$

Then

    $$ SABF = \frac{\sum_{s \in NFS} BR(s)}{|NFS|} $$

While finding this average, maximum and minimum branching factors are also
found. The result of this calculation for the example of this chapter is

    Average Branching Factor = 2.5
    Maximum Branching Factor = 3.0
    Minimum Branching Factor = 2.0
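
The static figures for APEX can be recomputed from the successor sets. In this sketch the network is written as a plain dictionary derived from the BNF of Figure 4-1; the state names (for instance the two separate EVERYTHING states) are labels chosen for the illustration and need not match the state numbers of Figure 4-2.

```python
# Successor sets NEXT(s) for the APEX network; final states have no successors.
NEXT = {
    "S0": ["HELLO", "GIVE"],
    "GIVE": ["MORE", "EVERYTHING-1", "ME"],
    "ME": ["EVERYTHING-2", "THE"],
    "THE": ["NEWS", "SUMMARY", "STORIES"],
    "HELLO": [], "MORE": [], "EVERYTHING-1": [], "EVERYTHING-2": [],
    "NEWS": [], "SUMMARY": [], "STORIES": [],
}

# NFS: the states whose successor set is not empty.
nfs = [s for s, successors in NEXT.items() if successors]
branching = [len(NEXT[s]) for s in nfs]     # BR(s) for each non-final state

sabf = sum(branching) / len(nfs)
max_bf = max(branching)
min_bf = min(branching)
print(sabf, max_bf, min_bf)  # 2.5 3 2
```

The four non-final states have 2, 3, 2 and 3 successors, which gives the average of 2.5 with a maximum of 3 and a minimum of 2.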

4.2. Dynamic Branching Factor - A Measure of Syntactic Restriction

The static method of averaging does not account for the fact that the sentence
"Hello" may occur fewer times than sentences described by the other paths. This may
be accounted for by assigning transition probabilities to each arc in the network. These
probabilities represent the relative frequencies of the alternative paths at each state.
The transition probabilities on the arcs leading from each state, say s, to the set of
next states sum to one.

Figure 4-5 shows the APEX network with these transition probabilities placed on
the arcs. Let P(s/r) be the probability of going to state s given current state r. From
those transition probabilities we calculate P(s/t), the probability of being in state s at
time t. Time is measured in words in this case. These probabilities are defined
recursively by

    Assign:  $$ P(s_0 / 0) = 1 $$

    Define:  $$ P(s / t) = \sum_{r \in PREV(s)} P(s / r) \, P(r / t-1) $$

Figure 4-5 shows these probabilities for all states and times which result in non-zero
probabilities.

Average sentence length (ASL) is a simple sum of these state/time probabilities
over all non-final states and time.

    $$ ASL = \sum_{t} \sum_{s \in NFS} P(s / t) $$

Thus, the average sentence length for this example is 1.0 + .8 + .64 + .32 = 2.76
words/sentence.

Dynamic branching factor may now be defined as follows. First, find the sums of
the log of the local branching factors probabilistically weighted and averaged over all
time. That is,

    $$ LWS = \frac{\sum_{t} \sum_{s \in NFS} P(s / t) \log\left[ BR(s) \right]}{\sum_{t} \sum_{s \in NFS} P(s / t)} $$

Note that the denominator in the above expression is simply the average sentence
length. Dynamic branching factor is then 2 to the exponent LWS.

    $$ DABF = 2^{LWS} $$
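
A small sketch shows how the state/time recursion, the average sentence length and the dynamic branching factor fit together for APEX. The transition probabilities below are assumptions: the values 0.8 (to GIVE), 0.8 (to ME) and 0.5 (to THE) are inferred from the sum 1.0 + .8 + .64 + .32 quoted in the text, while the remaining splits are arbitrary choices that do not affect the result. The function names are invented for the illustration.

```python
import math
from collections import defaultdict

# Assumed transition probabilities P(s/r), keyed as TRANS[r][s].
# States that do not appear as keys are final states.
TRANS = {
    "S0": {"HELLO": 0.2, "GIVE": 0.8},
    "GIVE": {"MORE": 0.1, "EVERYTHING": 0.1, "ME": 0.8},
    "ME": {"EVERYTHING": 0.5, "THE": 0.5},
    "THE": {"NEWS": 1 / 3, "SUMMARY": 1 / 3, "STORIES": 1 / 3},
}


def state_time_probs(trans, start="S0", max_t=100):
    """Return [{state: P(s/t)}] for t = 0, 1, ... using the recursion
    P(s/t) = sum over r of P(s/r) * P(r/t-1)."""
    current = {start: 1.0}
    history = []
    for _ in range(max_t + 1):
        if not current:
            break
        history.append(current)
        following = defaultdict(float)
        for r, p_r in current.items():
            for s, p_sr in trans.get(r, {}).items():
                following[s] += p_sr * p_r
        current = dict(following)
    return history


def asl_and_dabf(trans, start="S0"):
    """Average sentence length and dynamic branching factor (2 ** LWS)."""
    asl = 0.0
    weighted_log = 0.0
    for probs in state_time_probs(trans, start):
        for s, p in probs.items():
            br = len(trans.get(s, {}))      # local branching factor BR(s)
            if br:                          # non-final states only
                asl += p
                weighted_log += p * math.log2(br)
    return asl, 2.0 ** (weighted_log / asl)


asl, dabf = asl_and_dabf(TRANS)
```

Under these assumed probabilities the sketch reproduces the average sentence length of 2.76 words, and the dynamic branching factor comes out below the static average of 2.5 because the three-way branches are reached less often than the two-way branch at the initial state.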


Transition probabilities are necessary for the computation of dynamic branching
factor and average sentence length. These probabilities vary depending on the user's
preferences and the particular problem he is trying to solve in the task domain.
Learning these probabilities is a current topic of research in speech recognition [Bahl
et al., 1976]. For the purposes of computation, the transition probabilities have been
chosen such that

    $$ P(s, r) = 1/K \quad \text{where} \quad K = \begin{cases} |NEXT(s)| & s \in NFS \\ |NEXT(s)| + 2 & \text{otherwise} \end{cases} $$

The probability of moving to a new state is roughly uniform over all possible next
states, with some preference to termination if the current state is a final state. This
distribution assigns slightly higher weights to the shorter sentences.


The example presented in this chapter was not recursive and, therefore, the sum
over time terminated properly. In the case of recursion, some stopping criterion is
necessary. Since the computation time is not excessive in this calculation, a very loose
constraint is used. The computation is stopped whenever all sentences of length 100
or less have been considered (t = 100) or whenever the probability of remaining
sentences falls below 0.00001, whichever comes first. In the tasks examined, the only
task which went to sentences of length 100 was the Chess task. The residual
probability of sentences of greater length was .000011 in this case.


Average branching factors and sentence lengths are summarized in Figure 4-6.
This table contains the average, maximum and minimum static branching factors, the
dynamic branching factor and the average sentence length. The dynamic branching
factor from figure 4-6 assigns a relative ordering to the complexity of the tasks. That
order is CHESS, IBM, LLBAS, LIZ, VP and LLEXT. Static branching factor yields the
ordering CHESS, IBM, LIZ, LLBAS, VP and LLEXT. Maximum branching factor yields



            STATIC BRANCHING FACTOR     DYNAMIC       AVERAGE
                                        BRANCHING     SENTENCE
TASK         AVE       MAX      MIN     FACTOR        LENGTH

CHESS         8.65      21       1        7.36          8.10
LIZ          10.78      11       6        9.32          6.08
VP           14.11      37       1       10.82          8.22
IBM          10.58      24       1        7.73          8.09
LLBAS        11.34      61       1        9.15          7.52
LLEXT        25.32     161       1       20.28          8.93

Figure 4-6. Branching factors for the Tasks Studied.




much the same order except for the Lizard task. Minimum branching factor gives little
information since it is usually one. In all cases, the dynamic branching factor is lower
than the static branching factor. This is true because the log function gives higher
weights to small branching factors and, in general, the larger local branching factors
are found in the longer, and therefore less probable, sentences. The Lizard task has
the lowest average sentence length: 6.08 words per sentence. The average sentence
length falls between 7.5 and 9 words for the other tasks. It is interesting that the
chess task has the lowest branching factor and one of the largest average
sentence lengths. This means that individual decisions are easier, but there are more
decisions to be made. This leads to the notion of search space.

4.3.  Syntactic  Search  Space 

The average branching factor described above is a local measure of complexity.
It represents the degree of difficulty of making individual decisions. The total size of
the search space is a global measure of the complexity. The syntactic search space is
the size of a tree with this average branching factor and having depth equal to the
average sentence length. Since this number would be quite large, it is more
convenient to use the log of this quantity. Thus,

log[Search Space Size] = ASL * log[DABF]

                       = ASL * LWS

The log of the search space size for each of the tasks under consideration is given in
Figure 4-7. This number is, roughly, the number of binary decisions necessary to
recognize a sentence (considering syntax only). The relative ordering is now LIZ, IBM,
CHESS, LLBAS, VP and LLEXT. Lizard has moved down because there are fewer
decisions, on the average. Chess has moved up high in the ranking because of its long


TASK        log[Search Space Size]

CHESS             23.31
LIZ               19.56
VP                28.24
IBM               23.23
LLBAS             24.81
LLEXT             38.79

Figure 4-7. Log of Search Space Size.


sentence length. In practice, the average sentence length for the chess task is on the
order of 6 or 7 words. This is probably due to the "principle of least effort". In the
Chess task, moves may be said in a variety of ways and people will usually opt for the
smallest unambiguous sentence. Independent estimates of the average sentence length
could be used for this calculation, if they were available.
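
The search space figures can be reproduced, approximately, from the dynamic branching factor (DABF) and average sentence length (ASL) columns of Figure 4-6. The following is a minimal sketch, assuming that the logarithm is taken to base 2, as the "binary decisions" reading suggests. The function name and the dictionary layout are illustrative and are not taken from the original analysis programs.

```python
import math

# (dynamic average branching factor, average sentence length),
# copied from Figure 4-6 for four of the tasks.
TASKS = {
    "CHESS": (7.36, 8.10),
    "LIZ": (9.32, 6.08),
    "VP": (10.82, 8.22),
    "LLEXT": (20.28, 8.93),
}


def log_search_space(dabf, asl):
    """log2 of the size of a tree with branching factor dabf and depth asl.

    Base 2 is an assumption; the text only speaks of "binary decisions".
    """
    return asl * math.log2(dabf)


for task, (dabf, asl) in TASKS.items():
    print(f"{task:6s} {log_search_space(dabf, asl):6.2f}")
```

Under these assumptions the values agree with Figure 4-7 to within rounding of the tabulated inputs. Substituting an independently estimated sentence length, such as 6 or 7 words for Chess, only requires changing the second entry of the pair.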




5.  COMPLEXITY  IN  CONNECTED  SPEECH  - A RESTRICTED  MODEL 

Chapter 4 discussed how syntax restricts the number of word combinations
allowed in the language. Further restriction is possible when the syntax eliminates
confusable words from appearing within the same context. This chapter examines the
combined effects of vocabulary and syntax for connected speech in a restricted model.
A general model for connected speech is presented in chapter 6.

The model used in this chapter assumes that the recognition process is "well
behaved" in the sense that it proceeds almost entirely without error. That is, each
word of the utterance is assumed to have been recognized correctly as the process
moves from one correct state to another. The model therefore measures the average
ambiguity encountered during a correct recognition. Another view is that this is a
model for ambiguity in pause separated speech. We will refer to this as the "best"
case model.

5.1. Lexical Ambiguity and Syntactic Restriction

In chapter 4 the calculation of dynamic branching factor used the log of the local
branching factor as the quantity which was averaged. This may be interpreted to
mean that local alternatives are viewed as a set of entirely confusable words. This is
never true and, in fact, a well designed language will use the syntax to place
acoustically similar words in different contexts. Figure 5-1 gives the BNF description
of the Lizard task language. The word pair having highest acoustic similarity in this
task is "ADD" - "EIGHT". Figure 5-2 shows the initial state of the Lizard grammar
network along with its successors. Define the sub-vocabulary of a state s to be the


<UTT> ::=
        [<COMMAND>]

<COMMAND> ::=
        <OP> <SIGN-NUMBER>
        DISPLAY

<OP> ::=
        ADD
        SUBTRACT
        MULTIPLY
        DIVIDE
        LOAD

<SIGN-NUMBER> ::=
        MINUS <NUMBER>
        <NUMBER>

<NUMBER> ::=
        <DIGIT>
        <DIGIT> <NUMBER-2>

<DIGIT> ::=
        ZERO
        ONE
        TWO
        THREE
        FOUR
        FIVE
        SIX
        SEVEN
        EIGHT
        NINE

<NUMBER-2> ::=
        <DIGIT-2>
        <DIGIT-2> <NUMBER>

<DIGIT-2> ::=
        ZERO
        ONE
        TWO
        THREE
        FOUR
        FIVE
        SIX
        SEVEN
        EIGHT
        NINE

Figure 5-1. BNF Description for the Lizard Task.


set of words determined by the successors of state s. The only sub-vocabulary
containing the word "ADD" is the sub-vocabulary of the initial state. Note that it does
not contain the word "EIGHT". The syntax has isolated these two words from one
another in such a way that they would never cause an ambiguity, assuming that no
errors have yet occurred. In this particular example, the only time these two words
could be confused would be if the beginning word "ADD" were misrecognized as silence
and the second word of the utterance were "EIGHT". This may happen if "ADD" were
reduced or swallowed, the speech/no speech detector failed, or, more likely, the words
were run together so that the <D> went undetected and the two vowels were
missegmented as one vowel. If an error of this nature were made, then "EIGHT" could
easily be misrecognized as "ADD". Such problems will be addressed in chapter 6.
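
As an illustration of sub-vocabularies, the Lizard grammar of Figure 5-1 can be written as a small word network. The sketch below is hand-built for this purpose: the state names are hypothetical, and the network is only one possible rendering of the grammar, not necessarily the network of Figure 5-2.

```python
OPS = ["ADD", "SUBTRACT", "MULTIPLY", "DIVIDE", "LOAD"]
DIGITS = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
          "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]

# NETWORK[state] maps each word that may be spoken next to the state it
# leads to. State names are hypothetical. An utterance may also end in
# IN_NUMBER; end-of-utterance is not a word, so it is not listed.
NETWORK = {
    "START": {**{w: "AFTER_OP" for w in OPS}, "DISPLAY": "END"},
    "AFTER_OP": {"MINUS": "AFTER_MINUS", **{d: "IN_NUMBER" for d in DIGITS}},
    "AFTER_MINUS": {d: "IN_NUMBER" for d in DIGITS},
    "IN_NUMBER": {d: "IN_NUMBER" for d in DIGITS},
    "END": {},
}


def sub_vocabulary(state):
    """Words determined by the successors of the given state."""
    return set(NETWORK[state])


def share_a_context(word_a, word_b):
    """True if some state has both words in its sub-vocabulary."""
    return any(
        word_a in sub_vocabulary(s) and word_b in sub_vocabulary(s)
        for s in NETWORK
    )


print(sorted(sub_vocabulary("START")))
print(share_a_context("ADD", "EIGHT"))
```

In this rendering the initial state has a local branching factor of 6, and no state offers both "ADD" and "EIGHT", which is the isolation the text describes for error-free recognition.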

5.2.  Ambiguity  Analysis  in  the  Restricted  Model 

To combine the effects of vocabulary restriction and syntactic restriction, the
branching factor is replaced with the effective branching factor (equivalent vocabulary
size) for the sub-vocabularies of each state in the calculations performed in chapter 4.
The effective branching factor for sub-vocabularies is computed in the same way
equivalent vocabulary size was computed in chapter 3. Recall that the effective
vocabulary size for the Lizard vocabulary was 1.55 words. In the example of figure 5-2,
the local branching factor is 6 and the effective branching factor is 1.16.

The branching factors computed in chapters 3, 4 and this chapter are tabulated
in figure 5-3. The first two columns of this table contain the task name and the
number of words in the vocabulary of the task. The columns to the right contain
branching factors under various conditions. The effective branching factor for the
vocabulary, without the effects of syntactic restriction, is shown in the column labeled


                               BRANCHING FACTORS

                 Number of                                 Vocabulary
                 Words in       Vocabulary    Grammar      and
Task             Vocabulary     Only          Only         Grammar

PHONES               33           20.10        33            20.10
DIGITS               10            1.19        10             1.19
ALPHABET             26            3.87        26             3.87
ALPHA-DIG            36            3.41        36             3.41
CHESS                25            1.46         7.36          1.09
Lincoln Labs
  Basic             236            2.43         9.15          1.20
  Extended          410            3.54        20.28          1.34
IBM                 250            2.31         7.32          1.09
LIZARD               17            1.55         9.32          1.46
VP                   37            1.70        10.82          1.28
VPNS                 37            1.70        37.00          1.70

Figure 5-3. Results of Complexity Analysis.


"vocabulary only". It is the same as the effective vocabulary size described in chapter
3 and represents the average number of words retrieved in a lexical match per word
spoken. Thus, for the 10 digits, 1.19 words would appear, on the average, for each
word spoken. The column marked "grammar only" gives the average branching factor
considering syntax, but disregarding the effects of lexical ambiguity. This branching
factor, described in chapter 4, represents the average fan-out of the syntax; or, the
average number of words which may follow another word in an utterance. This column
is the same as the vocabulary size for the first 4 tasks since any word may follow any
other word. For the tasks with syntactic constraints, this branching factor ranges from
7.32 words to 20.28 words. The last column contains the effective branching factor
considering the combined effects of lexical ambiguity and syntactic constraint.

We will first consider the tasks in order and then general aspects of the
complete table. Recall that the phone task vocabulary was just the set of phones. The
effective vocabulary size obtained is 20. This means that every phone, on the
average, matches uniformly to 20 phonetic labels. It must be remembered that this is
for isolated phones without syntactic support, or even a surrounding lexical context.
Even so, this value seems rather high. This quantity has been computed from actual
counts from the BBN speech recognition system [Makhoul, 1975]. The value for their
system, which uses 67 different phoneme types and 83 acoustic classifications, is 4
labels/segment. If this figure were used as a standard, it says that the computation of
H(A/B) is roughly two and one-half times larger than it should be. If anything, this
implies that our model accounts for more variability in the phones than is really there;
that is, it is biased away from high quality, well articulated speech. We intended this
to be the nature of the system. Also, bear in mind that the models were designed for
relative comparisons.


For the 10 digits, the effective vocabulary size is 1.19. The interpretation here
is that six words will be retrieved for every five words spoken and one of them is
obviously wrong. This corresponds, roughly, to a recognition rate of 83%. Currently,
speech recognition systems have very little trouble recognizing the digits spoken in
isolation. Again, we see that if the model is biased, it is biased toward greater
variability. We feel that this is actually an advantage of the model; for, given the
relative soundness of the model, the differences between vocabularies are enhanced.

The spoken alphabet exhibits an effective vocabulary size of 3.87 words. This
is reasonable, particularly when compared to the digits, since the spoken alphabet is
highly ambiguous.

In the alphabet-digit vocabulary we see the effects of averaging. Assuming
equally probable choices from the 36 words, a vocabulary with an approximate
recognition rate of 80% is combined with one whose rate is 26% in the ratios 10/36
and 26/36 respectively. This gives approximately 40% recognition which is roughly
equivalent to a branching factor of 2.5. This method of combining branching factors is
an approximation, valid only when the recognition rates are near 100% (effective
branching factor of 1) and there is no inter-vocabulary ambiguity. There are
inter-vocabulary ambiguities: "two" and "u" or "three" and "g", for instance. This would
account for the effective branching factor being greater than predicted from the
independent results for the two vocabularies.
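
The averaging argument above can be checked with a few lines of arithmetic. This sketch assumes, as the figures quoted in the text suggest, that a recognition rate r corresponds roughly to a branching factor of 1/r. That correspondence and the function names are illustrative assumptions, not definitions taken from chapter 3.

```python
def combined_rate(parts):
    """Weighted average recognition rate.

    parts is a list of (number_of_words, recognition_rate) pairs, one per
    vocabulary; words are assumed to be equally probable.
    """
    total_words = sum(n for n, _ in parts)
    return sum(n * rate for n, rate in parts) / total_words


def branching_factor_from_rate(rate):
    """Rough equivalent branching factor, assuming BF is about 1 / rate."""
    return 1.0 / rate


# 10 digits at about 80% and 26 letters at about 26%.
rate = combined_rate([(10, 0.80), (26, 0.26)])
print(round(rate, 2), round(branching_factor_from_rate(rate), 2))
```

The combined rate comes out near 0.41, or a branching factor of about 2.4, in line with the 40% and 2.5 quoted above and below the 3.41 measured for the combined vocabulary.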

Consider now the tasks having syntactic restriction. The number of words in
their vocabularies ranges from 17 to 410 while the effective sizes range between 1.46
and 3.54. They seem to be directly related. This relative ordering is expected since
large vocabularies have greater potential for ambiguity and therefore would, in
general, have larger effective sizes. An interesting comparison can be made between
the chess vocabulary and the Lizard vocabulary. In this case, the 17 words of the
Lizard vocabulary have slightly higher confusion than the 25 words of the chess task.

One reason for the large effective vocabulary size of the Lincoln Labs Basic task
(2.43) is the fact that it contains word pairs which are almost identical, such as "to"-
"two", "recompute"-"recomputed" and "spectra"-"spectrum". This points to a difficulty
in the representation of a vocabulary: namely, when should two words be considered
separate entities? In the "two-to" case, syntax would probably disambiguate them
and the analysis procedures would treat them separately when considering syntax. If
"spectra" and "spectrum" appear within the same context and are functionally
differentiated, they must remain as two distinct words. On the other hand, if they
describe the same semantic notions, then the ambiguity is not one of real concern.

The  branching  factors  for  the  "grammar  only"  case  fall  into  the  range  7.32  to 
20.28. We see that syntactic restriction alone has nearly equalized the difficulty of
the  Chess,  Lincoln  Labs  Basic  and  IBM  languages.  Lizard  and  VP  have  larger  branching 
factors,  even  though  they  have  fewer  words  in  their  vocabularies.  The  Lincoln  Labs 
extended  task  has  the  largest  branching  factor  of  syntactically  constrained  languages. 

Each of the languages, except IBM's, contains the numbers in one form or
another. In the Chess task, the numbers are all single digits indicating rank or file. In
Lizard and VP, numbers are sequences of digits of indefinite length. They occur in
every sentence: in Lizard, this accounts for the branching factor being near ten. In VP,
numbers occur in approximately 75% of the sentences.

In the Lincoln Lab grammars, numbers come in the general form "one hundred
twenty four" but occur rarely.


The largest syntactic branching factor in the table belongs to VPNS. This task
has no syntactic constraints, uses the vocabulary of VP and attempts to recognize
sentences from VP. This configuration recognizes 30.8% of the words and 62.0% of the
sentences.

5.3.  Search  Space  Reduction 

One could compute the search space size given this new branching factor in the
same manner as was done in chapter 4. A more revealing number is the reduction in
search space size. We define the search space reduction ratio for a given branching
factor B as the log of (vocabulary size/B)^(average sentence length). A table of search
space reduction ratios for the tasks investigated is given in Figure 5-4. The column
labeled VOC is the search space reduction for B = effective vocabulary size. The column
labeled SYN is for B = dynamic branching factor and the third column is for B = effective
dynamic branching factor computed in this chapter. This is the total reduction
including vocabulary and syntax. The sum of the first two columns is not equal to the
third column. This is to be expected since the two interact. In all cases, the vocabulary
restriction is greater than the syntactic reduction. The vocabulary provides much
more constraint than the syntax for the first three tasks. For the last three tasks, the
ones with large vocabularies, the syntax provides much more restriction.


        log[SEARCH SPACE REDUCTION RATIOS]

TASK        VOC        SYN        TOTAL

CHESS      33.15      14.29       36.79
LIZ        20.97       5.27       21.54
VP         36.53      14.57       39.88
IBM        54.66      41.20       63.38
LLBAS      49.66      35.26       57.28
LLEXT      61.25      38.75       73.81

VOC   - Vocabulary Alone
SYN   - Syntax Alone
TOTAL - Total reduction

Figure 5-4. Search Space Reduction Ratios.
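
The reduction ratios can be recomputed from the vocabulary sizes, branching factors and average sentence lengths tabulated earlier. The following is a sketch only, assuming base 2 logarithms as in the search space computation of chapter 4; the function name and argument order are illustrative.

```python
import math


def reduction_ratio(vocabulary_size, b, asl):
    """log2 of (vocabulary_size / b) ** asl, written as asl * log2(...).

    Base 2 is an assumption carried over from the search space size.
    """
    return asl * math.log2(vocabulary_size / b)


# Lizard: 17 words, average sentence length 6.08 (Figure 4-6);
# branching factors 1.55, 9.32 and 1.46 (Figure 5-3).
voc = reduction_ratio(17, 1.55, 6.08)    # vocabulary alone
syn = reduction_ratio(17, 9.32, 6.08)    # syntax alone
total = reduction_ratio(17, 1.46, 6.08)  # vocabulary and syntax together

print(f"VOC {voc:.2f}  SYN {syn:.2f}  TOTAL {total:.2f}")
print(f"VOC + SYN = {voc + syn:.2f}")
```

For Lizard the three values agree with the LIZ row of the table to within rounding, and the sum of the first two clearly exceeds the total, which illustrates the interaction between vocabulary and syntax noted in the text.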


6. COMPLEXITY IN CONNECTED SPEECH - A GENERAL MODEL

A best  behavior  model  for  the  analysis  of  ambiguity  in  connected  speech  was 
exhibited  in  chapter  5.  In  this  chapter  the  limitations  of  that  model  are  discussed. 
Then,  a general  model  for  complexity  in  connected  speech  is  developed.  This  model 
represents "worst" behavior in the sense that it attempts to predict the ambiguity
faced  in  an  errorful  recognition. 

The  major  limitation  of  the  restricted  model  developed  in  chapter  5 is  that  it 
assumes  that  almost  all  the  recognition  proceeds  without  error.  That  is,  the  process 
moves  from  one  correct  state,  say  p,  to  another  correct  state;  the  ambiguity 
encountered being a function solely of the words which may follow state p. The
consequences  of  this  assumption  are  that  the  unit  of  time  depends  upon  the  choice  of 
the  representation  of  the  lexicon  and  syntactic  network.  For  the  purposes  of  the 
previous  chapter,  the  word  was  chosen  as  the  fundamental  network  path  length.  The 
choice  could  just  as  well  have  been  syllables:  the  model  applies  in  this  case  also. 
Another  consequence  is  that  boundaries  are  assumed  to  be  detected  without  error. 
Experience  with  speech  understanding  systems  indicates  that  nothing  is  ever  certain. 
In  particular,  there  is  an  uncertainty  about  which  state  is  the  "correct"  state.  This 
uncertainty  means  that  the  number  of  words  which  may  appear  next  in  the  speech  is 
greater  than  that  given  by  a single  sub-vocabulary,  as  in  the  previous  model. 

6.1.  Ambiguity  in  Connected  Speech 

It will be worthwhile, at this point, to consider various situations in which the
correct state becomes uncertain. The obvious way is when the sub-vocabulary of a
state contains two (or more) ambiguous words. For example, "and" and "ant" or "we"
and "the". Similarly, a phrasal ambiguity, such as "into" being confused with "in two",
may lead to an incorrect state. The obverse of phrasal ambiguity is when a word is
ambiguous with an initial substring of another word. This may happen for "in" and
"into" or "an" and "and", for example. This case differs from the previous two cases in
that the incorrect state may now be within a word. Such errors may have already
occurred, in which case an incorrect state leads to another incorrect state. Suppose
that the first three syllables of "accumulate" had been matched with some phonetically
similar acoustic sequence. In this case, the last syllable, "-late", may be confused with
any syllable of another word, say "late-ness". All of these types of errors may occur
when recognizing connected speech; and, as shown in the last example, they may
compound. Some examples from the Harpy Recognition System [Lowerre, 1976] will
show the kind of errors that may occur.

Correct: Gamma becomes negate epsilon.

Recognized: In mod becomes negate epsilon.

Correct: What is (s)even plus eight?

Recognized: One is one plus beta.

6.2.  General  Ambiguity  Model 

The approach, as it differs from the previous chapter, will be to consider the
opposite end of the spectrum, i.e. "worst" case behavior, and then to consider
modifications to the model which approach more nearly the actual situation. We want
to measure the ambiguity given that the correct state is not known with certainty. Let
p be the correct state and e be a state which may have been reached because of
recognition errors. Now, let x be a path leaving state p and y be a path leaving state
e. If x is the path which should have been followed and x and y are phonetically
similar paths, the uncertainty concerning the "correct" state (given that x is the correct
path) will be reduced very little. If, on the other hand, all paths leading from possible
error states are dissimilar phonetically from x, the uncertainty about the correct path
will be greatly reduced. Thus, what we want to measure is the ambiguity of all paths
leading from possible error states, keeping in mind that an error state may be in the
middle of a word.

Let p be the correct state. The absolute worst case would be where every
other state was a possible (error) state and for every path leaving state p there was
an identical path leaving every other state. If all paths from all states are identical,
the next word of the utterance gives no information about which state is in error.
Thus, if all states were equally probable before recognition of the word, they would
also be equally probable after recognition. The uncertainty or ambiguity under these
conditions would be the log of the number of states. Clearly, allowing all states as
being possible is unrealistic and would involve a great deal of computation. This
problem can be rectified by application of the following heuristic knowledge: the
difference in the number of syllables in a mistaken recognition and the correct
sentence is generally 2 or less; and furthermore, at any point of recognition, the
number of syllables, counting from the utterance beginning (or ending), usually differs
from the misrecognized number of syllables by at most two. Using syllables as the unit
of time, the number of paths which must be matched may be pruned in the following
manner. Assume each path in the network represents a syllable. Let t be the number
of syllables which have been considered by the recognizer. For each (syllable) time t,
the set of allowed, correct states may be determined in a manner analogous to
computing state probabilities in chapter 4. Formally, let T(t) be the set containing
states for which there exists a path of length t from the initial state. Then,


T(0) = { initial state }

T(t) = { s | ∃ r such that r ∈ PREV(s) and r ∈ T(t-1) }

Comparison of this with the computation of p(s|t) in chapter 4 should convince
the reader that an equivalent definition for T(t) is

T(t) = { s | p(s|t) > 0 }

Define S(t) to be all states in T(t) and T(t+1). Call the set of paths
leading from states in S(t) the super-vocabulary (at time t). What is to be determined
is the ambiguity of super-vocabularies averaged over all states and times. This is
done by computing the effective vocabulary size for each super-vocabulary and using
this as the branching factor in the equations of chapter 4.
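
The sets T(t) and S(t) can be computed by a simple forward pass over the network. The sketch below is illustrative only: the four-state network is a toy, the function names are hypothetical, and "all states in T(t) and T(t+1)" is read here as the union of the two sets, which is an assumption.

```python
# Toy syllable network (hypothetical): SUCCESSORS[r] lists the states
# reachable from r by one path, so s is in SUCCESSORS[r] exactly when
# r is in PREV(s).
SUCCESSORS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["C", "D"],
    "D": [],
}


def reachable_sets(successors, initial, max_t):
    """Return the list [T(0), T(1), ..., T(max_t)].

    T(0) holds only the initial state; T(t) holds every state with a
    predecessor in T(t-1).
    """
    levels = [{initial}]
    for _ in range(max_t):
        previous = levels[-1]
        levels.append({s for r in previous for s in successors.get(r, ())})
    return levels


def super_vocabulary_states(levels, t):
    """S(t), read here as the union of T(t) and T(t+1)."""
    return levels[t] | levels[t + 1]


levels = reachable_sets(SUCCESSORS, "A", 3)
for t, states in enumerate(levels):
    print(t, sorted(states))
print("S(1):", sorted(super_vocabulary_states(levels, 1)))
```

The paths leaving the states in S(t) would then form the super-vocabulary at time t, whose effective vocabulary size is computed as in chapter 3.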

6.3.  Discussion  of  Results 

The results of the computation of "worst" case branching factor are shown in
figure 6-1. For reference, the "best" case branching factor is also shown. This
computation was done for the Chess, Lizard and Voice Programming tasks. The larger
tasks require so much computation as to be infeasible at the present time. As
expected, the "worst" case branching factor is the larger of the two for all cases. For
the Lizard task the "worst" case branching factor is approximately twice that of the
"best" case. This would indicate that a recognizer which has made an error requires
twice as much information in order that it return to the correct path. The branching
factor for the Voice Programming task has increased from 1.3 to 7.6. Most of the
sentences in this language contain numbers of the form <DIGIT> <DIGIT> <DIGIT> ...
<DIGIT>, for example "Store one one one". If, at some point in the recognition, there


                  BRANCHING FACTORS

                Restricted      General
TASK            Model           Model

Lizard            1.46            3.17
Voice Prog.       1.28            7.63
Chess             1.09            3.92

Figure 6-1. Comparison of "best" and "worst" case
            branching factors.


                   WORDS               SYLLABLES
TASK            Number    BF        Number    BF

Lizard            17      1.46        25      1.50
Voice Prog.       37      1.28        52      1.53
Chess             25      1.09        33      1.11

Figure 6-2. Comparison of "best" case analysis for
            words and syllables.


is some uncertainty as to whether n or n+1 syllables have been seen, the next step of
recognition provides very little help in shifting the recognition toward the correct
path; in this case, the ambiguity is high. In fact, in this particular case, the system
would not recover without semantic or other higher level knowledge. Using syllables
as the unit of time means that shorter phone strings are matched at each time interval.
Shorter strings, in general, imply greater ambiguity. This effect must be considered
when comparing the "best" and "worst" case results. In figure 6-2, the effective
vocabulary size for the syllable vocabularies has been added. The results show the
increase due to using syllables is small relative to the increase due to the "worst"
case.




7.  RESULTS  OF  LANGUAGE  ANALYSIS 

In this chapter we present and discuss the results of language analysis. The
results are summarized in Figure 7-1. The first two columns of this table give the task
name and the number of words in the vocabulary of the task. The columns to the right
contain branching factors under various conditions. The effective branching factor for
the vocabulary, without the effects of syntactic restriction, is shown in the column
labeled "vocabulary only". It is the same as the effective vocabulary size described in
chapter 3 and represents the average number of words retrieved in a lexical match
per word spoken. Thus, for the 10 digits, 1.19 words would appear, on the average,
for each word spoken. The column marked "grammar only" gives the average
branching factor considering syntax, but disregarding the effects of lexical ambiguity.
This branching factor, described in chapter 5, represents the average fan-out of the
syntax; or, the average number of words which may follow another word in an
utterance. This column is the same as the vocabulary size for the first four tasks since
any word may follow any other word. For the tasks with syntactic constraints, this
branching factor ranges from 7.32 words to 20.28 words. The last column contains the
effective branching factor for the combined effects of lexical ambiguity and syntactic
constraint discussed in chapter 6. A brief description of each of the tasks precedes a
detailed discussion of the results.

7.1.  DESCRIPTION  OF  THE  TASKS 

Appendix  C contains  descriptions  of  the  languages  analyzed  in  this  thesis.  Each 
description consists of a definition of the syntax of the language and a dictionary for


                               BRANCHING FACTORS

                 Number of                                 Vocabulary
                 Words in       Vocabulary    Grammar      and
Task             Vocabulary     Only          Only         Grammar

PHONES               33           20.10        33            20.10
DIGITS               10            1.19        10             1.19
ALPHABET             26            3.87        26             3.87
ALPHA-DIG            36            3.41        36             3.41
CHESS                25            1.46         7.36          1.09
Lincoln Labs
  Basic             236            2.43         9.15          1.20
  Extended          410            3.54        20.28          1.34
IBM                 250            2.31         7.32          1.09
LIZARD               17            1.55         9.32          1.46
VP                   37            1.70        10.82          1.28
VPNS                 37            1.70        37.00          1.70

Figure 7-1. Results of Language Analysis.


its vocabulary. Dictionaries give the allowed pronunciations for each word in the
vocabulary. The first four tasks are not truly languages, but are sets of words we
wished to analyze. They have been given a simple syntactic description which allows
any word to follow any other word.

PHONES: A language consisting of a set of 33 phones. Describing the phones as a
    language makes possible the same analysis as for any other vocabulary. This
    means we can calculate the effective vocabulary size for the phones.

DIGITS: This vocabulary is the 10 digits. It was included because it was one of the
    first vocabularies used in speech recognition. It is still used, although usually
    for comparative purposes.

ALPHABET: This vocabulary is the spoken letters of the alphabet. It is highly
    ambiguous phonetically and is therefore a good test case.

ALPHA-DIGIT: The combination of the 10 digits and the 26 letters. Having this
    vocabulary allows one to evaluate the effect of combining two vocabularies.

CHESS: The original Hearsay I chess task language. It has a vocabulary of 25 words.

LIZARD: Lizard is a small voice programming language with a vocabulary of 17 words.
    It has been used with the HARPY speech recognition system.

VP: This language is also a voice programming language. It has been used by both the
    Hearsay I system and the HARPY system. It is richer in its syntax than Lizard
    and contains 37 words. This language has been used extensively as a test case
    for the HARPY Speech Understanding System in a mode where any word can
    follow any other word; i.e. there is no syntactic support. The results for this
    configuration of the language are shown in the last line of the table under
    VPNS.

IBM: This is the IBM "New Raleigh" Language. It describes syntactically correct
    English-like sentences with little or no semantic interpretation.

LLBAS: A language developed by Lincoln Labs for use with their speech recognition
    system. Its task is displaying and controlling acoustic data. There are 236
    words in its vocabulary.

LLEXT: An "extended" version of LLBAS containing 410 words.

7.2. DISCUSSION OF RESULTS

We will first consider the tasks in order and then general aspects of the
complete table. Recall that the phone task vocabulary was just the set of phones. The
effective vocabulary size obtained is 20. This means that every phone, on the
average, matches uniformly to 20 phonetic labels. It must be remembered that this is
for isolated phones without syntactic support, or even a surrounding lexical context.
Even so, this value seems rather high. This quantity has been computed from actual
counts from the BBN speech recognition system [Makhoul, 1975]. The value for their
system, which uses 67 different phoneme types and 83 acoustic classifications, is 4
labels/segment. If this figure were used as a standard, it says that the computation of
H(A/B) is roughly two and one-half times larger than it should be. If anything, this
implies that our model accounts for more variability in the phones than is really there;
that is, it is biased away from high quality, well articulated speech. We intended this
to be the nature of the system. Also, bear in mind that the models were designed for
relative comparisons.


For the 10 digits, the effective vocabulary size is 1.19. The interpretation here
is that six words will be retrieved for every five words spoken and one of them is
obviously wrong. This corresponds, roughly, to a recognition rate of 83%. Currently,
speech recognition systems have very little trouble recognizing the digits spoken in
isolation. Again, we see that if the model is biased, it is biased toward greater
variability. We feel that this is actually an advantage of the model; for, given the
relative soundness of the model, the differences between vocabularies are enhanced.

The  spoken  alphabet  exhibits  an  effective  vocabulary  size  of  3.87  words.  This 
is  reasonable,  particularly  when  compared  to  the  digits,  since  the  spoken  alphabet  is 
highly  ambiguous. 

In the alphabet-digit vocabulary we see the effects of averaging. Assuming
equally probable choices from the 36 words, a vocabulary with an approximate
recognition rate of 80% is combined with one whose rate is 26% in the ratios 10/36
and 26/36 respectively. This gives approximately 40% recognition, which is roughly
equivalent to a branching factor of 2.5. This method of combining branching factors is
an approximation, valid only when the recognition rates are nearly 100% (effective
branching factor of 1) and there is no inter-vocabulary ambiguity. There are
inter-vocabulary ambiguities: "two" and "u" or "three" and "g", for instance. This would
account for the effective branching factor being greater than predicted from the
independent results for the two vocabularies.
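
The averaging argument above can be checked with a few lines of arithmetic. The
sketch below assumes that the approximate recognition rate is the reciprocal of the
effective vocabulary size, which is how the figures in this section appear to be
related. The function names are illustrative only.

```python
def rate_from_size(effective_size):
    """Approximate recognition rate implied by an effective vocabulary size."""
    return 1.0 / effective_size

def combined_rate(parts):
    """Weighted average of recognition rates.

    parts is a list of (word_count, recognition_rate) pairs; each vocabulary is
    weighted by its share of the combined word list (equally probable words).
    """
    total_words = sum(count for count, _ in parts)
    return sum(count * rate for count, rate in parts) / total_words

# Digits (10 words, about 80%) combined with the alphabet (26 words, about 26%).
rate = combined_rate([(10, 0.80), (26, 0.26)])   # about 0.41
branching = 1.0 / rate                           # about 2.4
```

The result is close to the 40% and 2.5 quoted in the text. It ignores
inter-vocabulary confusions, which is why the measured value is larger.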

Consider now the tasks having syntactic restriction. The number of words in
their vocabularies ranges from 17 to 140 while the effective sizes range between 1.46
and 3.54. They seem to be directly related. This relative ordering is expected since
large vocabularies have greater potential for ambiguity and therefore would, in
general, have larger effective sizes. An interesting comparison can be made between
the chess vocabulary and the Lizard vocabulary. In this case, the 17 words of the
Lizard vocabulary have slightly higher confusion than the 25 words of the chess task.

One reason for the large effective vocabulary size of the Lincoln Labs Basic task
(9.15) is the fact that it contains word pairs which are almost identical, such as
"to"-"two", "recompute"-"recomputed" and "spectra"-"spectrum". This points to a
difficulty in the representation of a vocabulary: namely, when should two words be
considered separate entities? In the "two"-"to" case, syntax would probably
disambiguate them and the analysis procedures would treat them separately when
considering syntax. If "spectra" and "spectrum" appear within the same context and are
functionally differentiated, they must remain as two distinct words. On the other hand,
if they describe the same semantic notions, then the ambiguity is not one of real concern.

The branching factors for the "grammar only" case fall into the range 7.32 to
20.28. We see that syntactic restriction alone has nearly equalized the difficulty of
the Chess, Lincoln Labs Basic and IBM languages. Lizard and VP have larger branching
factors, even though they have fewer words in their vocabularies. The Lincoln Labs
extended task has the largest branching factor of the syntactically constrained languages.

Each of the languages, except IBM's, contains the numbers in one form or
another. In the Chess task, the numbers are all single digits indicating rank or file. In
Lizard and VP, numbers are sequences of digits of indefinite length. They occur in
every sentence; in Lizard, this accounts for the branching factor being near ten. In VP,
numbers occur in approximately 75% of the sentences.

In the Lincoln Lab grammars, numbers come in the general form "one hundred
twenty four" but occur rarely.



The largest syntactic branching factor in the table belongs to VPNS. This task
has no syntactic constraints, uses the vocabulary of VP and attempts to recognize
sentences from VP. This configuration recognizes 90.8% of the words and [?]2.0% of the
sentences [Lowerre, 1976].




8.  CONCLUSION 


This dissertation describes a general model for the analysis of languages for
man-machine communication. It is the first known study of ambiguity at all levels of
recognition and represents the best analytical tool we have, to date, for the design of
languages. This chapter presents a summary of the results and indicates directions for
future research.


8.1.  Contributions


8.1.1  The Overall Model

The model unifies the concepts of ambiguity and restriction. This is done by
expressing each as a branching factor, a notion which is easily understood and
visualized. Ambiguity increases the branching factor while restriction reduces it.
Using branching factor has the advantage that an effective search space size may be
computed for any language. Further, since ambiguity and syntactic restriction are
expressed in a uniform way, the effect of one with respect to the other may be
evaluated by considering search space reduction ratios.


The model is useful for comparing the relative complexities faced by speech
understanding systems. Effective vocabulary size provides a way of measuring the
complexity in isolated word recognition while effective search space size measures
language complexity. Thus, the performance of systems may be contrasted by
using these measures. Previously, this could be done only if the two systems had
been tested using the same data, a situation which occurred rarely.




Analyzing  and  anticipating  the  ambiguities  encountered  in  a specific  language  is 
useful  for  language  design  and  benchmarking.  Language  design  is  discussed  in  the 
section  on  future  work.  Benchmarking  means  deciding  whether  the  expected 
performance  of  a given  task  is  being  achieved.  If  it  is  not,  examination  of  the  errors 
which occurred and were not predicted by the model may point out flaws in the
system  which  had  gone  unnoticed;  and  vice  versa. 

8.1.2  Phonetic  Ambiguity 

The model uses phone-to-phone distance measures as a basis for subsequent
analysis. We have indicated several ways these measures might be obtained. The
choice of which one to use will depend on what is to be modeled and what type of
data is available to the user. Actual counts may be used, provided they are
trustworthy. Data obtained from either human perceptual or machine
recognition studies may be used. We have shown how metrics on the acoustic space
can be used. One of these, the Itakura metric, has been used as a basis for the
analysis presented in this thesis. Another method of obtaining these measures is by
using a theoretical model. We have presented one theoretical model, an articulatory
model. The performance of this model is not as good as we had hoped. It appears
that the phones must be described in finer detail in order to accurately capture their
relative differences. We intend to improve our model and also will look for other work
along these lines.

8.1.3  Lexical  Ambiguity 

In computing lexical ambiguity we developed a phone sequence matching
algorithm which is easily extendible to phrases. Effective vocabulary size was shown
to be a valid measure of the inherent complexity of a vocabulary. Information
theoretic concepts proved useful in this analysis. We feel they are applicable in many
other areas of speech understanding systems.

8.1.4  Syntactic  Restriction 

We have exhibited a useful way of viewing syntactic restriction, i.e. dynamic
branching factor. This measure of complexity is compatible with the measure of
vocabulary complexity. The notion of branching factor has been used in other areas of
computer science. When applied in a straightforward way to measure syntactic
ambiguity, it is very revealing. We have seen task descriptions which list the number
of non-terminals and rules of the grammar; they should also list the average branching
factor.
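
As an illustration of what an average branching factor reports, the sketch below
computes an unweighted average over the choice points of a small finite-state
grammar. This is a simplified static average and is only an assumption about the
general idea; the dynamic branching factor used in this thesis may weight choice
points differently. The grammar representation and the toy grammar are hypothetical.

```python
def average_branching_factor(grammar):
    """Average number of alternative words per choice point.

    grammar maps each state to a dict {word: next_state}. States with no
    outgoing words are final states and are not counted as choice points.
    """
    choice_points = [arcs for arcs in grammar.values() if arcs]
    return sum(len(arcs) for arcs in choice_points) / len(choice_points)

# Toy grammar: a two-word command followed by one of four piece names.
toy_grammar = {
    "START": {"move": "PIECE", "take": "PIECE"},
    "PIECE": {"pawn": "END", "king": "END", "queen": "END", "rook": "END"},
    "END": {},
}
toy_factor = average_branching_factor(toy_grammar)
```

A grammar with many rules can still have a small value here, since additional
rules often remove alternatives rather than add them.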

8.1.5  Language  Analysis 

Two models for ambiguity in connected speech were presented: a "best"
behavior  model  and  a "worst"  behavior  model.  Both  models  combine  the  effects  of 
lexical  ambiguity  and  syntactic  restriction.  The  "best"  behavior  model  measures  the 
ambiguity  encountered  when  most  of  the  recognition  proceeds  without  error.  The 
"worst"  behavior  model  measures  the  ambiguity  faced  by  an  error-prone  system.  In 
effect,  it  indicates  the  difficulty  of  returning  to  the  correct  path  given  that  the 
recognition  has  taken  a wrong  path. 

8.1.6  The  Tasks 

The Chess task [Reddy, et al., 1972; Baker, 1975; Lowerre, 1976] has an effective
search space size of 23.31. Its equivalent vocabulary size of 1.46 is the lowest of all
the tasks studied. The effective branching factor for this task is 1.09, also the lowest
and the same as for the IBM task.

The Lincoln Labs "extended" task [Forgie, et al., 1974] has the largest search
space size, 38.79. It is the most difficult task by all measures except effective
branching factor; Lizard and VPNS have larger effective branching factors. The
"basic" task, even though its vocabulary contains 236 words, is of roughly the same
difficulty as the voice programming task when considering syntactic and effective
branching factors.

The IBM "New Raleigh" task [Tappert, 1975; Baker and Bahl, 1975] has an
effective search space size of 23.23. Its effective branching factor is 1.09, the same
as for the chess task. The syntactic branching factor for this task is 7.32, lowest of all
the tasks.

For the Lizard task [Lowerre, 1976], the search space size, 19.56, is the smallest
of  all  the  tasks.  Its  effective  branching  factor  of  1.46,  however,  is  the  largest  of  the 
languages  having  syntactic  constraints. 

The  voice  programming  task,  VP  [Erman,  1974;  Baker,  1975;  Lowerre,  1976],  has 
an  effective  search  space  size  of  28.24.  This  task  has  the  largest  syntactic  branching 
factor  of  the  medium  sized  languages.  VP  with  no  syntax  has  the  highest  syntactic 
branching  factor. 

The important contribution of this thesis is that it provides a way to
characterize the relative difficulties and accomplishments of different speech
understanding systems. Vocabulary size is not a good measure of lexical complexity;
some other measure of vocabulary size, normalized for relative ambiguity, would be
better. The number of production rules is not a useful measure of grammatical
complexity. In fact, quite the opposite may be true: more rules imply more constraint.
Some other measure, such as the average number of alternatives at each choice point,
would be better. Investigators in the area of speech understanding should reference
their results to some standard. This thesis presents some useful measures.

8.2.  Directions  for  Future  Research 

With  the  generation  of  any  large  system,  particularly  in  a new  area,  many  new 
ideas  for  improvements  are  spawned  and  many  inviting  avenues  are  left  unexplored. 
This  investigation  was  no  exception.  Possible  improvements  to  the  model  are  outlined 
below. 

1.  Improvement of the theoretical phonetic ambiguity model will be necessary
in order for it to be used as a basis for the lexical and phrasal model. Until
such time, the acoustic similarity metrics described should be adequate.

2.  Although the model provides a particular solution to the juncture ambiguity
problem, more detailed use of phonological rules should lead to a more
precise model.

3.  Analysis  of  the  ambiguities  encountered  in  segmentation  and  their 
implications  for  phonetic  ambiguity  should  lead  to  a better  model. 

4.  The model assumes that Context-Free languages, as used in speech
understanding  systems,  can  be  represented,  for  all  practical  purposes,  by  a 
finite  state  approximation.  In  doing  this,  some  small  amount  of  restrictive 
power  may  be  lost.  While  this  is  not  considered  a serious  problem,  further 
investigation  into  the  nature  of  its  effects  should  be  considered. 




5.  Semantic ambiguity happens when two sentences are phonetically similar
enough that one may be recognized as the other (or they may both be
recognized, with a match score for each, by some systems) and the two
sentences cannot be disambiguated by semantics. Conversely, semantics
may apply constraints to the vocabulary and syntax which would eliminate
ambiguous sentences from being considered. The notion of branching
factor accommodates either viewpoint. Analysis should be done at this
level also, although we have no specific ideas about how it could be done.
It should be investigated to whatever extent possible.

8.3.  Implications for Language Design

Given  that  a reasonable  analytical  tool  is  available,  a fruitful  area  for  future 
research is the design of languages for man-machine communication. Designing
languages  would  include,  but  not  necessarily  be  limited  to,  the  following  possibilities: 

1.  Reducing  the  ambiguity  of  a language  by  altering  the  vocabulary  and  syntax 
of  the  language  or  by  redefining  the  task.  Sometimes  alteration  of  the 
vocabulary  and  syntax  may  be  hindered  by  standard  or  accepted  usage. 
This  would  be  true  of  the  numbers  and  the  chess  task.  At  other  times, 
there  are  free  choices;  as  with  the  names  "ALPHA",  "BETA",  "GAMMA", 
"DELTA",  "EPSILON"  in  the  voice  programming  language. 

2.  Tailoring a task and language to some predefined constraints. For instance,
it would be desirable to know just how much ambiguity could be tolerated
by a system whose processor was a mini-computer with restricted memory
and fixed instruction time. This aspect will become increasingly important
as the use of speech understanding systems grows and new tasks are
undertaken.

In order to design languages, one must understand the ambiguities
involved. The results of the analysis presented in this dissertation provide this
information.




REFERENCES 


Bahl, L.R., J.K. Baker, P.S. Cohen, N.R. Dixon, F. Jelinek, R.L. Mercer, and H.F.
Silverman (1976), "Preliminary Results on the Performance of a System
for the Automatic Recognition of Continuous Speech", Proceedings of
the IEEE International Conference on Acoustics, Speech, and Signal
Processing, Philadelphia, April 1976, 425-429.

Baker,  J.K.  (1975),  Stochastic  Modeling  as  a Means  of  Automatic  Speech 
Recognition,  Ph.D.  Thesis,  Computer  Science  Department,  Carnegie- 
Mellon  University,  Pittsburgh,  Pennsylvania,  April  1975. 

Baker, J.K. and L.R. Bahl (1975), "Some Experiments in Automatic Recognition of
Continuous Speech", Proceedings IEEE Computer Conference 75,
September 1975, 326-329.

Erman, L.D. (1974), An Environment and System for Machine Understanding of
Connected Speech, Ph.D. Thesis, Computer Science Department,
Carnegie-Mellon University, Pittsburgh, Pennsylvania, May 1974.

Feldman, J. and D. Gries (1968), "Translator Writing Systems", Communications of the
ACM  11,  2 (February),  1968,  77-113. 

Forgie,  J.W.,  and  C.D.  Forgie  (1959),  "Results  Obtained  From  a Vowel  Recognition 
Computer  Program",  Journal  of  the  Acoustical  Society  of  America,  31, 
1480-1489,  November  1959. 

Forgie, J.W. et al. (1974), Speech Understanding Systems: Semi-Annual Report,
Lincoln Labs, MIT, Lexington, Mass., May 1974.

Fu,  K.S.  and  T.  Li  (1969),  "On  Stochastic  Automata  and  Languages",  Information 
Sciences,  Vol.  1,  403-420,  1969. 

Goldman, Stanford (1953), Information Theory, Prentice-Hall, New York, 1953.

Itakura,  F.  (1975),  "Minimum  Prediction  Residual  Principle  Applied  to  Speech 
Recognition",  IEEE  Transactions  on  Acoustics,  Speech,  and  Signal 
Processing,  23,  February  1975,  67-71. 

Ladefoged, P. and D.E. Broadbent (1957), "Information Conveyed by Vowels",
Journal of the Acoustical Society of America, 29, 98-104, January 1957.




Lowerre,  B.  (1976),  The  HARPY  Speech  Recognition  System,  Ph.D.  Thesis,  Carnegie- 
Mellon  University,  Pittsburgh,  Pennsylvania,  April  1976. 

Makhoul,  J.  (1975),  personal  communication. 

Miller,  G.A.  and  P.E.  Nicely  (1955),  "An  Analysis  of  Perceptual  Confusions  Among 
Some English Consonants", Journal of the Acoustical Society of America,
27,  338-352,  March  1955. 

Newell, A., J. Barnett, J. Forgie, C. Green, D. Klatt, J.C.R. Licklider, J. Munson, R.
Reddy, and W. Woods (1971), Speech Understanding Systems: Final
Report of a Study Group, Pub. by North Holland (1973).

Newell, A. (1975), "A Tutorial on Speech Understanding Systems", Speech
Recognition,  Reddy,  D.R.(Ed.),  Academic  Press,  New  York,  1975. 

Peterson,  G.E.  and  H.L.  Barney  (1952),  "Control  Methods  Used  in  a Study  of  the 
Vowels",  Journal  of  the  Acoustical  Society  of  America,  24,  175-184, 
March  1952. 

Reddy,  D.R.,  L.D.  Erman,  R.D.  Fennell,  and  R.B.  Neely  (1973),  "The  HEARSAY  Speech 
Understanding  System:  An  Example  of  the  Recognition  Process", 
Proceedings  of  the  3rd  International  Joint  Conference  on  Artificial 
Intelligence, Stanford, California, 185-193.

Tappert, C.C. (1975), "Experiments with a Tree-Searching Method for Converting
Noisy  Phonetic  Representation  into  Standard  Orthography",  IEEE 
Transactions  on  Acoustics,  Speech,  and  Signal  Processing,  23,  February 
1975. 

Unger,  S.  (1968),  "A  Global  Parser  for  Context-free  Phrase  Structured  Grammars", 
Communications  of  the  ACM  11,4  (April),  240-247. 

Woods, W.A. (1970), "Transition Network Grammars for Natural Language Analysis",
Communications of the ACM 13, 10 (October), 1970, 591-602.


Appendix A:  Phonetic Ambiguity - Itakura Metric




A.1.  Itakura Metric Calculation

The Itakura metric [Itakura, 1975] matches an input signal with stored reference
patterns using the distance function:

d(X/a) = c + log[ (br) / (a'r) ]

where

(xy) means the inner product of two vectors,
X is a segment of the time signal x(1), x(2), ... x(N),
a = 1, a(1), a(2), ... a(p) are the LPC model parameters for the reference pattern,
c = log[(aa)],
b = 1, b(1), b(2), ... b(p) are the modified LPC coefficients computed from a,
r = r(0), r(1), ... r(p) are the autocorrelation coefficients for X,
a' is the vector representing the LPC model for X.

Reference patterns for the phones used in this thesis are given in section A.2.
Each pattern has the form:

phone c
1.0 b(1) b(2) b(3) b(4)
b(5) b(6) b(7) b(8) b(9)
b(10) b(11) b(12) b(13) b(14)

These coefficients were derived from the autocorrelation coefficients
given in A.3. They have the form:

phone 1
1.0 a(1) a(2) a(3) a(4)
a(5) a(6) a(7) a(8) a(9)
a(10) a(11) a(12) a(13) a(14)

The distance, d(X/a), is the logarithm of the conditional probability, p(X/a), that
the input signal X was generated by the LPC model defined by a. By comparing the
autocorrelation coefficients with the reference patterns, the conditional probabilities
p(p1/p2) may be computed for each pair of phones p1 and p2. A matrix of these
probabilities is shown in section A.4. This table contains probabilities which are
normalized so that p(x/x) = 1.
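
The calculation can be sketched in a few lines, reading the distance as
c + log[(br)/(a'r)]. The sketch rests on several assumptions that are not stated in
this appendix: that b(k) is twice the autocorrelation of the reference parameters a at
lag k, divided by (aa), so that c + log[(br)] is the log of the residual energy left by
the reference model; that a' is obtained from r by the Levinson-Durbin recursion; and
that (a'r) is the minimum residual energy. All function names are hypothetical.

```python
import math

def levinson_durbin(r, p):
    """Solve the LPC normal equations for autocorrelations r[0..p].

    Returns (a, err) with a = [1, a(1), ..., a(p)] and residual energy err.
    """
    a = [1.0] + [0.0] * p
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def reference_pattern(a):
    """Precompute (c, b) for a reference model a = [1, a(1), ..., a(p)]."""
    p = len(a) - 1
    # Autocorrelation of the coefficient sequence itself.
    ra = [sum(a[j] * a[j + k] for j in range(p + 1 - k)) for k in range(p + 1)]
    c = math.log(ra[0])                     # c = log[(aa)]
    b = [1.0] + [2.0 * ra[k] / ra[0] for k in range(1, p + 1)]
    return c, b

def inner(x, y):
    """Inner product (xy) of two equal-length vectors."""
    return sum(u * v for u, v in zip(x, y))

def itakura_distance(c, b, r):
    """d(X/a) for an input segment with autocorrelations r[0..p]."""
    a_opt, _ = levinson_durbin(r, len(b) - 1)
    return c + math.log(inner(b, r) / inner(a_opt, r))

# Example: input with autocorrelation 0.5**k against a mismatched reference.
r_in = [1.0, 0.5, 0.25]
c_ref, b_ref = reference_pattern([1.0, -0.9, 0.2])
d_mismatch = itakura_distance(c_ref, b_ref, r_in)
```

Under these assumptions the distance is zero when the reference model is the
optimal model for the input and positive otherwise.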



A.2.  Reference Patterns for Itakura Metric

.*653838 


. ieeo8ooei 

.9*81*53 

.*11280* 

.4116*20 

..575353 

. 1872581 

-.1*88818 

-.579*935 

-.382596  ■ 

-.3537391 

-.*9*1177 

-.3728067 

-.*878909 

-.2399*88 

-.49889636-1 

V 

.1572598el 

.188888881 

-.1468318a! 

.88801*4 

-.613893* 

.4*26651 

.18555*0 

-.6291837 

.5*98009 

-.3225522 

.2688221 

-.2878323 

.8631823a-l 

.28868*26-1 

-.2961992S-1 

.28352286-1 

N 

.1772191#! 

.leooooosi 

-.16*6*69#1 

.135282261 

-. 1812057S1 

.5329*55 

-.*72881* 

.481568* 

-.5*48*01 

.709t77C 

-.6561877 

.6832893 

-.4587166 

.2588628 

-. 189**28 

.46961696-1 

0 

.8561885 

.1800008#1 

-.8*15627 

.3819256 

-.2366808 

-.6365611 

-.6732695#-! 

.417793* 

-.155983* 

.455388* 

-.73753786-1 

-.2167171 

.**25386a-l 

-.1321879 

.5216835#-! 

.86357116-1 

PO 

.1872732#! 

.iooeeo8#i 

-.589293* 

.551*796 

.2578911 

-.5*8589* 

.118*381#! 

-.1568861 

.6362915 

.1875962 

-.3*19255 

.3963176 

-.932*6976-1 

.21597*4 

.1317259 

-.11279556-1 

K 

.132632*81 

.1808888#! 

.11685*3#! 

.186383261 

.3284955 

.1969825 

.*831753 

.2786819 

.**2*938 

-. 13063856-1 

-.57278356-1 

-.2758850 

-.1591132 

-.1137082 

-.233*5236-1 

-.1*226576-2 

C 

.617888* 

. looeoooel 

.6223**8 

.987*1116-2 

-.7767338 

- 18*598761 

-.5122625 

.**162026-2 

.2158609 

.36*5362 

.2136235 

.280688* 

-.18*3659 

-.55899376  1 

-.73207128-1 

-.57843286-1 

s 

.237662861 
. 180808861 

.173138761 

.121813961 

.6293892 

.22893*2 

-. 351*5516-1 

-.1976283 

-.26328*5 

-.270662* 

-.22*5*9* 

-.21*7891 

-.1886363 

-.1595308 

-.81822286-1 

-.29979836-1 

T 

.1585892#! 
. 188808861 

.1*58897#! 

.1369*7*61 

.7766711 

.712883* 

.3687056 

.1188773 

-.822*9596-1 

-.222**82 

-. 1568316 

-.1/7*207 

-.25*2178#-! 

-.21212*56-1 

.37956*76-1 

-.  1*789766-2 

PX 

.228493261 

.1880008#1 

-.179596*#! 

.1**5*5961 

-.181627061 

.688959* 

-.2387336 

.3375595#-! 

.29158566-1 

.32203926-1 

-. 1017121 

. 15*8753 

-.1*89085 

.1*6229* 

-.97*65*9#-! 

.321968*6-1 

PE 

.2*76707#! 
. 160800061 

-.176812161 

.171388161 

-.153395*61 

.12991*961 

-.1025217#! 

.8195718 

-.568*309 

.*55762* 

-.2262559 

. 1996797 

-.1198*96 

.*5172726-1 

-.22056376-1 

.33889266-1 

IH 

.6985018 
. lOOOOOOel 

.2738923 

.18820*261 

.42*3288 

.863994C 

.6861938 

.58*8877 

.5591835 

.2*86*45 

.2765797 

.355*765 

.3171888 

.19156726-1 

.67638186-1 

.13579886-1 

PP 

.7672659 
. 188888861 

.27537*8 

.7224589 

.3563328 

.282*728 

.8663*86 

.6*89116 

.5757259 

.285*831 

-.46828256-1 

.2939971 

.1*19888 

.1955692 

.2881737 

-.127*257 



i4  ■ n .i0*i*73(}i 


. laeeoooei 

-. 1840C0291 

. 9640988 

117936991 

.4358865 

-.5844048 

.8864014 

-.4076996 

.5097568 

-.7423652 

EH 

.2589478 

.256847811 

-.3263575 

.1954818 

.1221569 

.46744569-1 

. 186888081 

-. 177505791 

. 1597873*1 

-.  1^5^882*1 

, 8558688 

-.5145747 

,7767279 

-,S831!fi5e  I 

.42068859-1 

.44318429-1 

U 

-.64701118.2 

.iei4->078l 

-.11478089- 

.52006019-2 

-.469826*9-2 

.20714769-1 

.109000081 

-.6719078 

,71967690  1 

-.7023114 

-.1419251 

.4006761 

-.85579469-1 

,5'7/567 

-.1391835 

NX 

-.86''O?0Se  1 
.135655181 

. 1774109 

. 77283679- 1 

-.195422? 

.95649599-1 

.180006081 

-.  148612191 

.803’7'’8 

1848488 

-.3841723 

.3753657 

-.17688829-1 

-.2055122 

,400124’ 

-.3883483 

L 

. 1373754 
.5973559 

.19'^  78, 

-.58!7C2!9-1 

. •21.'8'2 

-.44144339-1 

. 180000081 

.33588089-1 

. 1959081 

-.3474187 

-.26678819-1 

.4557708 

.2308812 

.90  (0.1..  3 

. 16360488-1 

.2178133 

UU 

-.1998036 
. 13‘’1596el 

-.1159052 

.232367! 

. 1849393 

.2848662 

. 100080081 

-.128284691 

.8484092 

.102138991 

. 1987648 

.175273,3 

-.  1584844 

.659)436 

-.5427459 

.3802.‘,87 

Y 

-.5952312 

.164664381 

.4059821 

-.75469139-1 

.1834871 

-.1151173 

.188800881 

-.2152640 

.648481=^ 

-.1191'9791 

-.4156332 

-.3173563 

.23402079-1 

.5350594 

-. 12957E9 

. 2749816 

ER 

-.2908673 

.286849181 

.1482831 

-.71924829-1 

.83564369-1 

13229809-1 

. 188000081 

-.171358991 

.112243091 

-.5752272 

.76728699-1 

.4391078 

-.7811429 

.7514453 

-.6108526 

.4479258 

B 

-.3275888 

.4870763 

.1898355 

-.o9670319-1 

.12191561-1 

. 18266089-3 

. 100800081 

.7175213 

. .'511603 

-.4782538 

-.6599186 

-.4863518 

.3303471 

.3435398 

-.1423548 

-.2939424 

OH 

.1343774 

.5824395 

.2331645 

.22816H9 

.2199586 

-.48788819-1 

. 188880881 

-.6374152 

.9040350 

-.5291229 

.2992921 

. 1652888 

.14828319-2 

.2494591 

- 65470759  1 

.4329274 

lY 

-.1077868 

.228443381 

.1082039 

-.2301427 

-.56124789-2 

.45801499-2 

. 188000081 

-.118262691 

.134501291 

-.129491791 

.4543056 

-.7928761 

.2277431 

-.52331276-1 

.1429815 

.1357118 

r 

-.11705428-1 

.6091735 

.56815499-1 

-.40291329-1 

.17819039-1 

-.73274879-2 

.180888881 

.4107559 

.8780966 

-.1209926 

-.1568682 

.1864959 

-.1758721 

.4450189 

- 20357279-2 

.2488879 

HH 

,1895976 
. 1183437H 

-.57638519-1 

.1489330 

-.82305289-1 

.1188176 

.188088891 

.31418149-1 

.181793891 

-.8251846 

.22811969-1 

-.7516666 

.2753571 

-.3162514 

.485613! 

-.3479637 

P 

.3361489 

.133988281 

-.2939970 

.2172461 

-.81604779-1 

.81942869-1 

.188088891 

.6483634 

.152693891 

.6169871 

.118675491 

.8869688 

.7939511 

.6470248 

.3327926 

.4705566 

.2325033 

.2486448 

.1086712 

.69988439-1 

.42885839-1 



ou 

.6662480 

.leeeeeeei 

-.1455828 

.94771629-1 

-.1433321 

-.29566289-1 

.61661129-1 

SH 

.269923991 

.leeooeeei 

.169183391 

-.1456324 

-.1861294 

.1534698 

.1247145 

UH 

.211918891 

.180880091 

-.7463413 

.4388554 

.1837626 

-.42086869-1 

.1475354 

RH 

.132366491 

.186860691 

-.7598235 

.2142586 

.4420243 

-.1197412 

.4143636 

-.5652606 

-.2183936 

.6566666 

.5364356 

.1393861 

-.33515659-1 

.34988489-1 

.64946319-1 

. 1598899 

.124677891 

.5672762 

.91173459-1 

-.23658839-1 

.67877349-1 

.1791986 

.66444969-1 

.26289939-1 

.58679759-2 

.2884738 

-.126816791 

.4646748 

.16172279-1 

-.4935656 

.1932944 

-.75277679-1 

.75636629-3 

.66931619-2 

.4963862 

-.133928691 

.3519968 

.1161817 

-.5935886 

.11763269-1 

-.37491459-1 

.65626659-1 

-.1898726 



A.3.  Autocorrelation Vectors for Phone Reference Patterns


V 


N 


0 


no 


K 


c 


s 


T 


nx 


nt 


IH 


nn 


.leooeoevi 

. U64  76,’ 
.21Vk2: 

-.•’288125 
, 1349749*-! 
.52513&1*-! 

1144131 
37»  4852 
06  2 

5666456*-! 
.78!  0 S*-l 
.631  le-1 

. 1614015 
. 1371316 
. "218621 

. lofionneei 

-.:?60983A 

-.9801593*-! 

.545247] 

5226 16h*  1 
-.1737180 

. ! ’01£80 

.■.’69H70-1 

.29.;J751 

- 5f  '.633W.J 
-20 ’501" 
48998-1 

-.5032854 

.2787059 

-.2368247 

.1000000*1 

-.1538444 

-.2209182 

. 1684182 

.2341701 
. 1699807 

-,3''50'-6'.’ 

.25418S’’ 

..‘0  3R59'J*  .1 

1845842 
. !714'J46 
1164862 

.5868284 
-.2714685 
- I510’!3 

. 1000000*1 
.2134794 
-.2939558 

.4697946 
-. 1889920 
-,  • 164846 

-.I325-18*  1 

.3]8;’99^. 

. 988' '>’'2fc-2 

’.132105 
H0458O 
-.  23'”384< 

.6452802 

-.>24e400e-l 

-.12,14474 

.1000000*1 

-.5964925 

.7445676*-! 

.613646 
- 5846240 
.1550786 

.247725H 
-.5'=  '2  341 
.3Ci5794 

.B4?.’505*  i 
- ■>64 ’581 
. 1579057 

-2798560 

.5985912*-! 

.B943268*-! 

.looooooat 

-.62734  10 
.3516739 

-.2443574 

-.4»67868e-2 

-.857587?*-2 

- .7759651 
-.29723'-p 

.5122135 
-, ;256464 
. 48!5n0 

.4177193 

-.2199533 

.2110758 

. 1 006600*! 

.2520509 

.2695171*-! 

.3196531 

.1196185 

.2377262 

.1996  77 
.3514191 
.1264236 

.4710756 

.•’941729 

5253414*-? 

.6013976 

.1660434 

.5614547*-! 

. 1066606*1 
-.4076372 
-.3756318 

-.4747483 

.6513632 

-.1842743 

-.4846925 

-.3.9u.'3v-e 

.6106206 

8304715 

,69919 

-.4263494 

-.3148533 

.5483262 

-.2848856 

. lOOOOOOel 
-.5236049 
.1413873 

-.4020676 

.4667192 

-.2217581 

-.5610065 
. 8076257*. 1 
.5203889*  1 

-. J723B47 
.5920076*-) 

-.1473374 
. 1942893 
-.B631G11*-! 

.1066666*1 

.8522591*-! 

-.5816178 

.6998251 
-. 1308032 
-.591687? 

.316:;,'5S 

-.28£!705 

-.3823164 

. P/64444 
385137S 
371515 

. 1375339*-! 
-.•'417292 
.5075869#-! 

.1686666*1 

.1438057 

,4768901 

.6073816*-? 

.5539806*-2 

.2918202 

-.2984334 

-.2532339 

.1596943 

.4197423 
-.  J283950*-! 
-.34038"! 

-.4369415#-! 

-.3049655 

.7956751#-! 

,1060000*1 

.3596229 

.3696094 

.2842963 

,1335819 

-.3045219 

-.1106172 

-.1779676*-! 

.1064142 

-.4212102*-! 

.9935820*-! 

.5760817*-! 

-.4999777 

.1692662 

.3318786 

.1906600*1 

.5492872 

.2I250188-1 

.6409851 

-.7374097 

.2811722 

.2493692 

-.6377681 

.4589548 

.5459470*-! 

-.3757167 

.5124494 

-.1844885 

-.1648717 

.5130397 



EC 


M 


NX 


L 


UU 


r 


ER 


OH 


lY 


r 


HH 


P 


.ieeeeoe«i 

.1876337 

-.687922*6-1 

1 

.33*7378 

-.37897198-1 

.217**88 

-.65282716-1 

.288*938 

.17*9689 

.*173726 

.3*90*89 

-.2697851 

.582*565 

.13396*5 

-.1358399 

L 

.188888061 
-.*299389 
-. 19831*2 

1 

-. 1582835 
.1772878 
. 18385*56-2 

-.5871161 

.13887886-1 

.36*9381 

.5*2*689 

-.23173*7 

-.2889*62 

.1165738 

.*2753286-1 

-.2686*83 

L 

. 1800000«1 
-.19828836-1 
-.279797* 

1 

.89*8516 

-.28625*’ 

-.1698156 

.7268838 

-.3533348 

-.35*31266-1 

.5269778 

-.3986995 

.*688*826-1 

.2*59*62 

-.3*65359 

.59615816-1 

1 

. 1118888861 
-.77625866-1 
-. 19992886-1 

1 

.*85*3** 

-.1*85668 

-.182*813 

.21578166-1 

.25698686-1 

-.2986752 

.3081688 

-.83*19726-1 

-.2*98006 

.3**6198 

-.1483197 

-.187679* 

L 

. 188888861 
-.3858951 
-.688**816-1 

1 

.583811* 

-.5868237 

.168359* 

.*136782 

-.6296178 

-.52655626-2 

.216188* 

-.2781713 

.56583296-1 

-.2681899 

-.2*57883 

.375*2336-1 

i 

.188808861 

-.28282*8 

.*5511736-1 

1 

.5851369 

.68661256-2 

-.31*9253 

.21585*6 

-.14*7745 

-.2616310 

.*369369 

-.2657272 

-.1763368 

.82828796-1 

.78298326-1 

-.3781573 

1 

.188088861 

-.381*387 

-.72989936-2 

-.713*5276-1 
-.2*53598 
. 1889839 

-.6863*66 

.328164* 

-.19823*3 

.4319392 
.168*785 
-. 1239536 

.56*6289 

-.2*98311 

.2263288 

.188888861 
-.  1556628 
-.*659607 

.79*1**2 

.18371*56-1 

-.51*36*3 

.3*56068 

.63958276-1 

-.3968895 

-.58*72*86-1 

-.616*9886-1 

-.28859*6 

-.236*898 

-.2753992 

-.5525*326-2 

. 188000861 

.39*7312 

.1656677 

.2753762 
.338' ’88 
.181*58* 

.*817322 
.*8*8386 
. 1653826 

.*9265*6 

.216218* 

.617558*6-1 

.*885*9* 

.3918326 

.139*3*9 

. 108008861 
-.3662128 
-.  18793196-1 

. 18232356-1 
-.118858* 
.1**22** 

-.*58*18* 

.1*81993 

.1183181 

.2955278 

-.93062676-1 

-.11533636-1 

. 1696581 
-.2*67663 
.9*892586-2 

.188808861 

.2959362 

-.222289* 

-.778*8886-1 

-.71*59086-1 

-.2588’7* 

-.6772*31 

-.6885622 

-.59826*16-1 

. 12817*8 
.208*718 
.396*8776-1 

.2821922 

.5*38381 

.3197732 

.188008061 

-.3281671 

.38377666-1 

-.2338883 
. 18657336-1 
.**978**6-1 

-.6168338 

.1125261 

-.9*325206-1 

.*312937 

-.128*6396-1 

.98517796-1 

.286*127 

-.92786876-1 

.12271696-1 

.188080861 

-.2876862 

-.19296216-1 

-.1556851 

-.235519* 

.37192716-1 

-.792*881 
. 1837326 
-.6559*176-1 

.3265335 

.78360926-1 

.26112656-1 

.*916686 

-.68015726-1 

.1866373 

.188888861 

-.2808329 

-.27982*2 

-.*887il8e-2 

-.58122*2 

.*588*21 

-.5266*92 

.18439*5 

.21*9*82 

.2886876 

.*939695 

-.2538286 

.2837898 

-.2938778 

-.1581857 

. leeeeeeti 

.6226916 

.2667666 

.5995629 

-.51735)87 

.4467311 

.3328667 

-.5552674 

.3675776 

.97676566-1 

-.1119559 

-.66876146-1 

.166688661 

.2613658 

.5778152 

-.1455661 

-.5284975 

.3254668 

-.8664353 

.1763847 

.4899869 

.2948325 

.5616964 

-.414(259 

.186666601 

.5763466 

.88864486-1 

.6646612 

-.5733147 

.4673686 

.4686S;2 

-.6978378 

.4686411 

.2556887 

-.4178742 

.2926596 

. 188686861 

.44696236-1 

.5576211 

.2282843 

.78429946-1 

-.1676417 

.87347796-1 

-.3295973 

-.1763546 

.5485922 

.47636856-1 

-.5275783 

-.S099286 

.2455174 

-.1961242 

.6462193 

-.2666262 

-.3253974 

-.3291388 

-.87565376-1 

.2459644 

-.55539866-1 

-.25812696-1 

-.1783641 


I 


A.4.  Phonetic Ambiguity Matrix - Itakura Metric

[Phone-by-phone ambiguity matrix under the Itakura metric; the entries are not legible in this copy.]


Appendix B:  Phonetic Ambiguity - Articulatory Model


B.1.  Articulatory Features and Allowed Values


1.  Vocal Tract Closure    O-  open
                           C-  closed or constricted
                           T-  turbulent

2.  Vocal Chords           V-  vibrating (voiced)
                           U-  not vibrating (unvoiced)

3.  Nasal Cavity           O-  open
                           C-  closed

4.  Tongue Position        B-  back
                           C-  central
                           F-  front

5.  Tongue Height          L-  low
                           M-  medial
                           H-  high

6.  Tongue Tip             M-  moving
                           N-  not moving

7.  Lips                   N-  normal
                           C-  closed
                           R-  rounded


B.2.  Definition of the Phones in terms of their Feature Values


IY    VOICED  OPEN  FRONT  HIGH
IH    VOICED  OPEN  FRONT  HIGH
EY    VOICED  OPEN  FRONT  MID
EH    VOICED  OPEN  FRONT  MID
AE    VOICED  OPEN  FRONT  LOW
AA    VOICED  OPEN  BACK  LOW
AH    VOICED  OPEN  CENTRAL  MID
AO    VOICED  OPEN  BACK  LOW
OW    VOICED  OPEN  BACK  MID
UH    VOICED  OPEN  BACK  HIGH
UW    VOICED  OPEN  BACK  HIGH
AX    VOICED  OPEN  CENTRAL  MID
IX    VOICED  OPEN  FRONT  HIGH
ER    VOICED  OPEN  CENTRAL  MID  TIP MOVEMENT
AW    VOICED  OPEN  BACK  LOW
AY    VOICED  OPEN  BACK  LOW
OY    VOICED  OPEN  BACK  HIGH  ROUNDED
Y     VOICED  OPEN  CENTRAL  HIGH  TIP MOVEMENT  ROUNDED
W     VOICED  OPEN  CENTRAL  MID  ROUNDED
R     VOICED  OPEN  CENTRAL  MID  CLOSED
L     VOICED  OPEN  CENTRAL  MID  TIP MOVEMENT
M     NASALIZED  CLOSED  NASAL  FRONT  MID  CLOSED
N     NASALIZED  CLOSED  NASAL  CENTRAL  HIGH  TIP MOVEMENT
NX    NASALIZED  CLOSED  NASAL  BACK  LOW
P     UNVOICED  TURBULENT  FRONT  MID  CLOSED
T     UNVOICED  TURBULENT  CENTRAL  HIGH  TIP MOVEMENT
K     UNVOICED  TURBULENT  BACK  HIGH
B     VOICED  CLOSED  FRONT  MID  CLOSED
D     VOICED  CLOSED  CENTRAL  HIGH  TIP MOVEMENT
G     VOICED  CLOSED  BACK  HIGH
HH    UNVOICED  TURBULENT  BACK  HIGH
F     UNVOICED  TURBULENT  FRONT  MID
TH    UNVOICED  TURBULENT  FRONT  HIGH  TIP MOVEMENT
S     UNVOICED  TURBULENT  CENTRAL  HIGH
SH    UNVOICED  TURBULENT  CENTRAL  MID
V     VOICED  TURBULENT  FRONT  MID
DH    UNVOICED  TURBULENT  FRONT  HIGH  TIP MOVEMENT
Z     VOICED  TURBULENT  CENTRAL  HIGH
ZH    VOICED  TURBULENT  CENTRAL  MID
CH    UNVOICED  TURBULENT  CENTRAL  HIGH  TIP MOVEMENT
JH    VOICED  TURBULENT  CENTRAL  HIGH  TIP MOVEMENT
WH    UNVOICED  TURBULENT  BACK  MID
EL    VOICED  OPEN  FRONT  HIGH  TIP MOVEMENT
EM    VOICED  OPEN  NASAL  CENTRAL  MID  CLOSED
EN    VOICED  OPEN  NASAL  CENTRAL  MID  TIP MOVEMENT
DX    VOICED  CLOSED  CENTRAL  MID  TIP MOVEMENT
Q     UNVOICED  TURBULENT  CENTRAL  MID
-     UNVOICED  CLOSED  CENTRAL  MID  CLOSED
      VOICED  OPEN  CENTRAL  MID

J 




B.3.  Influence Coefficients


Voiced          Unvoiced          4.0
Voiced          Nasalized         0.2
Unvoiced        Nasalized         6.0
Open            Closed            8.5
Open            Turbulent         7.0
Closed          Turbulent         4.0
Nasalized       Non-nasalized     2.5
Front           Central           1.0
Front           Back              1.0
Central         Back              1.0
Low             Middle            1.0
Low             High              1.5
Middle          High              1.0
Tip movement    No movement       0.4
Rounded         Normal            0.2
Rounded         Normal            0.2
Closed          Normal            0.3

Note: These coefficients are somewhat ad hoc and are likely to
change over the next few years. Anyone wishing their current values
should contact the author.
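
As an illustration of how the feature definitions of B.2 and the coefficients of B.3 might be combined, the following sketch represents a few phones by their feature values and adds up the coefficient of every feature on which two phones differ. The additive rule, the names `COEFF`, `PHONES` and `distance`, and the reading of an unlisted tongue-tip or lip value as "none" or "normal" are assumptions made for this example; this appendix does not itself state how the coefficients are applied.

```python
# Hypothetical sketch: sum the B.3 influence coefficients over differing features.
FEATURES = ("voicing", "closure", "position", "height", "tip", "lips")

# Coefficients copied from B.3, grouped by the feature they compare.
COEFF = {
    "voicing": {frozenset(("voiced", "unvoiced")): 4.0,
                frozenset(("voiced", "nasalized")): 0.2,
                frozenset(("unvoiced", "nasalized")): 6.0},
    "closure": {frozenset(("open", "closed")): 8.5,
                frozenset(("open", "turbulent")): 7.0,
                frozenset(("closed", "turbulent")): 4.0},
    "position": {frozenset(("front", "central")): 1.0,
                 frozenset(("front", "back")): 1.0,
                 frozenset(("central", "back")): 1.0},
    "height": {frozenset(("low", "mid")): 1.0,
               frozenset(("low", "high")): 1.5,
               frozenset(("mid", "high")): 1.0},
    "tip": {frozenset(("tip", "none")): 0.4},
    "lips": {frozenset(("rounded", "normal")): 0.2,
             frozenset(("closed", "normal")): 0.3},
}

# A few phones from B.2; unlisted tip/lip values are assumed to be "none"/"normal".
PHONES = {
    "IY": ("voiced", "open", "front", "high", "none", "normal"),
    "IH": ("voiced", "open", "front", "high", "none", "normal"),
    "AA": ("voiced", "open", "back", "low", "none", "normal"),
    "P": ("unvoiced", "turbulent", "front", "mid", "none", "closed"),
    "B": ("voiced", "closed", "front", "mid", "none", "closed"),
    "T": ("unvoiced", "turbulent", "central", "high", "tip", "normal"),
}


def distance(a, b):
    """Add the coefficient of each feature whose value differs between a and b."""
    total = 0.0
    for feature, va, vb in zip(FEATURES, PHONES[a], PHONES[b]):
        if va != vb:
            total += COEFF[feature][frozenset((va, vb))]
    return total
```

Under these assumptions IY and IH are indistinguishable (their feature values are identical), while P and B differ in voicing and closure.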




B.4.  Phonetic Ambiguity Matrix - Theoretical Model

[Phone-by-phone ambiguity matrix under the theoretical (articulatory) model; the entries are not legible in this copy.]


Appendix C:  TASK DEFINITIONS


This appendix contains descriptions of the languages analyzed in this thesis.
Each description consists of a definition of the syntax of the language and a dictionary
for its vocabulary. Dictionaries give the allowed pronunciations for each word in the
vocabulary. The first four tasks are not truly languages, but are sets of words we
wished to analyze. They have been given a simple syntactic description which allows
any word to follow any other word.

PHONS is a language consisting of a set of 33 phones. Describing the phones as a
language makes possible the same analysis as for any other vocabulary. This
means we can calculate the effective vocabulary size for the phones.

DIGIT: This vocabulary is the 10 digits. It was included because it was one of the first
vocabularies used in speech recognition. It is still used, although usually for
comparative purposes.

ALPHA: This vocabulary is the spoken letters of the alphabet. It is highly ambiguous
phonetically and is therefore a good test case.

ADIG: Is the combination of the 10 digits and the 26 letters. Having this vocabulary
allows one to evaluate the effect of combining two vocabularies.

CHESS: The original Hearsay I chess task language. It has a vocabulary of 25 words.

LIZ: Lizard is a small voice programming language with a vocabulary of 17 words. It
has been used in the Harpy speech recognition system.

VP: This language is also a voice programming language. It has been used by both the
Hearsay I system and the Harpy system. It is richer in its syntax than Lizard
and contains 37 words.

IBM: This is the IBM "New Raleigh" Language. It describes syntactically correct
English-like sentences with little or no semantic interpretation.

LLBAS: A language developed by Lincoln Labs for use with their speech recognition
system. Its task is displaying and controlling acoustic data. There are 236
words in its vocabulary.

ILEXT:  An  "extended"  version  of  LLBAS  containing  fllO  words. 


C.1.2  PHONS Dictionary


C.2.  Digit Language


C.2.1  DIGIT Syntax


<S>::=  [ <WORDS> ]

<WORDS>::=  <WORD> <WORDS>
            <WORD>

<WORD>::=  ZERO
           ONE
           TWO
           THREE
           FOUR
           FIVE
           SIX
           SEVEN
           EIGHT
           NINE
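
The word-loop syntax above can be checked mechanically. The sketch below is only an illustration: it treats `[` and `]` as the sentence-boundary terminals (an interpretation of the grammar, since they also appear as dictionary entries) and accepts one or more digit words between them. The names `DIGIT_WORDS` and `is_digit_sentence` are hypothetical.

```python
# Hypothetical recognizer for <S> ::= [ <WORDS> ], with <WORDS> one or more <WORD>.
DIGIT_WORDS = {
    "ZERO", "ONE", "TWO", "THREE", "FOUR",
    "FIVE", "SIX", "SEVEN", "EIGHT", "NINE",
}


def is_digit_sentence(tokens):
    """Return True if tokens are '[', at least one digit word, then ']'."""
    if len(tokens) < 3 or tokens[0] != "[" or tokens[-1] != "]":
        return False
    return all(tok in DIGIT_WORDS for tok in tokens[1:-1])
```

Because any word may follow any other, the syntax contributes no constraint beyond the vocabulary itself, which is what makes these first tasks useful as vocabulary-only test cases.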




C.2.2  DIGIT Dictionary


ZERO   (-,0) S (-,0) (IH,IY) ER OW
ONE    (-,0) W AH N
TWO    (-,0) T (-,0) IH UW
THREE  (-,0) F (-,0) ER IY
FOUR   (-,0) F (-,0) AO ER
FIVE   (-,0) F (-,0) AA (AX,IH) V
SIX    (-,0) S (-,0) IH (-,0) K (-,0) S
SEVEN  (-,0) S (-,0) EH V (EH,AX) N
EIGHT  (-,0) EH (IH,AX) (-,0) T
NINE   (-,0) N AA IH N
[
]
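
To make the dictionary notation concrete, the sketch below expands one pronunciation pattern into every phone sequence it allows. It assumes that a parenthesized group lists comma-separated alternatives, that groups may nest, and that `0` stands for the empty alternative; these readings are inferred from the entries and are not stated in this appendix. The function name `expand` is hypothetical.

```python
import re

# Punctuation is tokenized one character at a time; anything else is a phone.
TOKEN_RE = re.compile(r"[(),]|[^\s(),]+")


def expand(pattern):
    """Return every phone sequence a dictionary pattern allows, as tuples."""
    tokens = TOKEN_RE.findall(pattern)
    pos = 0

    def sequence(stop):
        # Parse items until a stop token; return all alternative phone lists.
        nonlocal pos
        results = [[]]
        while pos < len(tokens) and tokens[pos] not in stop:
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                choices = group()
            elif tok in (",", ")"):
                raise ValueError("unexpected %r in pattern" % tok)
            elif tok == "0":
                choices = [[]]  # "0" is read as the empty alternative
            else:
                choices = [[tok]]
            results = [r + c for r in results for c in choices]
        return results

    def group():
        # Parse "alt , alt , ... )" after an opening parenthesis.
        nonlocal pos
        alternatives = []
        while True:
            alternatives.extend(sequence({",", ")"}))
            if pos >= len(tokens):
                raise ValueError("unbalanced parentheses")
            closing = tokens[pos]
            pos += 1
            if closing == ")":
                return alternatives

    variants = sequence(set())
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(tuple(v) for v in variants))
```

For example, `expand("(-,0) W AH N")` yields the pronunciation of ONE with and without the leading silence, and the four optional silences in SIX give sixteen variants.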


C.3.  Alphabet Language


C.3.1  ALPHA Syntax

<S>::=  [ <WORDS> ]

<WORDS>::=  <WORD> <WORDS>
            <WORD>

<WORD>::=  "A"
           "B"
           "C"
           "D"
           "E"
           "F"
           "G"
           "H"
           "I"
           "J"
           "K"
           "L"
           "M"
           "N"
           "O"
           "P"
           "Q"
           "R"
           "S"
           "T"
           "U"
           "V"
           "W"
           "X"
           "Y"
           "Z"


C.3.2  ALPHA Dictionary


"A"  (-,0) EH (IH,AX)
"B"  (-,0) B IY
"C"  (-,0) S IY
"D"  (-,0) D IY
"E"  (-,0) IY
"F"  (-,0) EH F
"G"  (-,0) G IY
"H"  (-,0) EH (IH,AX) (-,0) T SH
"I"  (-,0) AA IH
"J"  (-,0) D SH EH (IH,AX)
"K"  (-,0) K EH (IH,AX)
"L"  (-,0) EH L
"M"  (-,0) EH M
"N"  (-,0) EH N
"O"  (-,0) OW
"P"  (-,0) P IY
"Q"  (-,0) K Y UW
"R"  (-,0) AA ER
"S"  (-,0) EH S
"T"  (-,0) T IY
"U"  (-,0) Y UW
"V"  (-,0) V IY
"W"  (-,0) D AX B ((EH,0) L,0) Y UW
"X"  (-,0) EH K S
"Y"  (-,0) W AA IH
"Z"  (-,0) S IY
[
]


C.4.  Alphabet-Digit Language


C.4.1  Alphabet-Digit Syntax


<S>::=  [ <WORDS> ]

<WORDS>::=  <WORD> <WORDS>
            <WORD>

<WORD>::=  "A"
           "B"
           "C"
           "D"
           "E"
           "F"
           "G"
           "H"
           "I"
           "J"
           "K"
           "L"
           "M"
           "N"
           "O"
           "P"
           "Q"
           "R"
           "S"
           "T"
           "U"
           "V"
           "W"
           "X"
           "Y"
           "Z"
           ZERO
           ONE
           TWO
           THREE
           FOUR
           FIVE
           SIX
           SEVEN
           EIGHT
           NINE


C.4.2  Alphabet-Digit Dictionary

"A"    (-,0) EH (IH,AX)
"B"    (-,0) B IY
"C"    (-,0) S IY
"D"    (-,0) D IY
"E"    (-,0) IY
"F"    (-,0) EH F
"G"    (-,0) G IY
"H"    (-,0) EH (IH,AX) (-,0) T SH
"I"    (-,0) AA IH
"J"    (-,0) D SH EH (IH,AX)
"K"    (-,0) K EH (IH,AX)
"L"    (-,0) EH L
"M"    (-,0) EH M
"N"    (-,0) EH N
"O"    (-,0) OW
"P"    (-,0) P IY
"Q"    (-,0) K Y UW
"R"    (-,0) AA ER
"S"    (-,0) EH S
"T"    (-,0) T IY
"U"    (-,0) Y UW
"V"    (-,0) V IY
"W"    (-,0) D AX B ((EH,0) L,0) Y UW
"X"    (-,0) EH K S
"Y"    (-,0) W AA IH
"Z"    (-,0) S IY
ZERO   (-,0) S (-,0) (IH,IY) ER OW
ONE    (-,0) W AH N
TWO    (-,0) T (-,0) IH UW
THREE  (-,0) F (-,0) ER IY
FOUR   (-,0) F (-,0) AO ER
FIVE   (-,0) F (-,0) AA (AX,IH) V
SIX    (-,0) S (-,0) IH (-,0) K (-,0) S
SEVEN  (-,0) S (-,0) EH V (EH,AX) N
EIGHT  (-,0) EH (IH,AX) (-,0) T
NINE   (-,0) N AA IH N
[
]


C.5.  Chess Language


C.5.1  Chess Syntax

<BIGMOVE>::=  [ <MOVE> ]

<MOVE>::=  <MOVE1> <CHECK-WORD>
           <MOVE1>

<MOVE1>::=  <REGULAR-MOVE>
            <CAPTURE>
            <CASTLE>

<CASTLE>::=  <CASTLE-WORD> ON <UNIROYAL> SIDE
             <CASTLE-WORD> <UNIROYAL> SIDE
             <CASTLE-WORD>

<REGULAR-MOVE>::=  <PCE-LOC> <MOVE-WORD> <SQUARE>
                   <PAWN-LOC> <MOVE-WORD> <SQUARE38>

<CAPTURE>::=  <EP-PAWN> <CAPTURE-WORD> PAWN EN-PASSENT
              <PCE-LOC> <CAPTURE-WORD> <CMAN-LOC>
              <PAWN-LOC> <CAPTURE-WORD> <PMAN-LOC>

<CASTLE-WORD>::=  CASTLE-S

<MOVE-WORD>::=  TO
                MOVES-TO
                GOES-TO

<CAPTURE-WORD>::=  TAKES
                   CAPTURES

<CHECK-WORD>::=  CHECK MATE
                 CHECK

<EP-PAWN>::=  <EP-PAWN-LOC>
              <UNIROYAL> <EP-PAWN-LOC>
              <UNIROYAL> <UNIPIECE> <EP-PAWN-LOC>
              <UNIPIECE> <EP-PAWN-LOC>

<EP-PAWN-LOC>::=  PAWN ON <UNIROYAL> <PIECE> FIVE
                  PAWN ON <NOPAWN> FIVE
                  PAWN

<CMAN-LOC>::=  <CPCE-LOC>
               <PAWN-LOC>

<PCE-LOC>::=  <PCE-SPEC> ON <SQUARE>
              <PCE-SPEC>

<PCE-SPEC>::=  <UNIROYAL> <PIECE>
               <NOPAWN>

<CPCE-LOC>::=  <CPCE-SPEC> ON <SQUARE>
               <CPCE-SPEC>

<CPCE-SPEC>::=  <UNIROYAL> <PIECE>
                <NOPNOK>

<PAWN-LOC>::=  <PAWN-SPEC> ON <SQUARE27>
               <PAWN-SPEC>

<PAWN-SPEC>::=  <UNIROYAL> <UNIPIECE> PAWN
                <UNIROYAL> PAWN
                <UNIPIECE> PAWN
                PAWN

<PMAN-LOC>::=  <CPCE-SPEC> ON <SQUARE38>
               <CPCE-SPEC>
               <PAWN-SPEC> ON <SQUARE37>
               <PAWN-SPEC>

<SQUARE>::=  <UNIROYAL> <PIECE> <RANK>
             <NOPAWN> <RANK>

<SQUARE27>::=  <UNIROYAL> <PIECE> <RANK27>
               <NOPAWN> <RANK27>

<SQUARE38>::=  <UNIROYAL> <PIECE> <RANK38>
               <NOPAWN> <RANK38>

<SQUARE37>::=  <UNIROYAL> <PIECE> <RANK37>
               <NOPAWN> <RANK37>

<UNIROYAL>::=  KING-S
               QUEEN-S

<UNIPIECE>::=  BISHOP-S
               KNIGHT-S
               ROOK-S

r>i  srrM 

1 1 

BISHOP 

KNIGHT 

ROOK 

<NOPAWN>::=  KING
             <NOPNOK>

<PIECE>::=  BISHOP
            KNIGHT
            ROOK

<RANK37>::=  THREE
             FOUR
             FIVE
             SIX
             SEVEN

<RANK27>::=  <RANK37>
             TWO

<RANK38>::=  <RANK37>
             EIGHT

<RANK>::=  <RANK38>
           ONE
           TWO
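
The rank rules that close the chess syntax show how the grammar restricts which ranks may follow which pieces. The sketch below encodes only those four rules and enumerates the terminal words each nonterminal can produce. The dictionary layout and the function name `derive` are assumptions of this example, and the enumeration only terminates for non-recursive rules such as these.

```python
# Hypothetical encoding of the rank rules: each nonterminal maps to a list of
# alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "<RANK37>": [["THREE"], ["FOUR"], ["FIVE"], ["SIX"], ["SEVEN"]],
    "<RANK27>": [["<RANK37>"], ["TWO"]],
    "<RANK38>": [["<RANK37>"], ["EIGHT"]],
    "<RANK>": [["<RANK38>"], ["ONE"], ["TWO"]],
}


def derive(symbol):
    """Return every terminal string derivable from symbol (finite grammars only)."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal derives only itself
    results = []
    for alternative in GRAMMAR[symbol]:
        partial = [[]]
        for part in alternative:
            partial = [p + d for p in partial for d in derive(part)]
        results.extend(partial)
    return results
```

Here `derive("<RANK>")` produces all eight ranks, whereas `derive("<RANK27>")` omits ONE and EIGHT, so a pawn location is never recognized on the first or last rank.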

C.5.2  Chess Dictionary


BISHOP      (-,0) B (AX,IH) SH AX P
BISHOP-S    (-,0) B (AX,IH) SH AX P (S,0)
CAPTURES    (-,0) K AE P (-,0) T SH ER S
CASTLE-S    (-,0) K AE S (EH,0) L S
CHECK       (-,0) T SH EH K
EIGHT       (-,0) EH (IH,AX) T
EN-PASSENT  (-,0) AA N P AA S AA N
FIVE        (-,0) F AA IH V
FOUR        (-,0) F OW ER
GOES-TO     (-,0) G OW S T AX
KING        (-,0) K IH NX
KING-S      (-,0) K IH NX (S,0)
KNIGHT      (-,0) N AA IH T
KNIGHT-S    (-,0) N AA IH T (S,0)
MATE        (-,0) M EH (IH,AX) T
MOVES-TO    (-,0) M UW V S T AX
ON          (-,0) AA N
ONE         (-,0) W AH N
PAWN        (-,0) P AO N
QUEEN       (-,0) K W IY N
QUEEN-S     (-,0) K W IY N (S,0)
ROOK        (-,0) ER UH K
ROOK-S      (-,0) ER UH K (S,0)
SEVEN       (-,0) S EH V AX N
SIDE        (-,0) S AA IH D
SIX         (-,0) S IH K S
TAKES       (-,0) T EH (IH,AX) K S
THREE       (-,0) F ER IY
TO          (-,0) T AX
TWO         (-,0) T UW
[
]


C.6.  Lizard Language

C.6.1  Lizard Syntax

<UTT>::=  [ <COMMAND> ]

<COMMAND>::=  <OP> <SIGN-NUMBER>
              DISPLAY

<OP>::=  ADD
         SUBTRACT
         MULTIPLY
         DIVIDE
         LOAD

<SIGN-NUMBER>::=  MINUS <NUMBER>
                  <NUMBER>

<NUMBER>::=  <DIGIT>
             <DIGIT> <NUMBER-2>

<DIGIT>::=  ZERO
            ONE
            TWO
            THREE
            FOUR
            FIVE
            SIX
            SEVEN
            EIGHT
            NINE

<NUMBER-2>::=  <DIGIT-2>
               <DIGIT-2> <NUMBER>

<DIGIT-2>::=  ZERO
              ONE
              TWO
              THREE
              FOUR
              FIVE
              SIX
              SEVEN
              EIGHT
              NINE


C.6.2  Lizard Dictionary


ADD       (-,0) (HH,0) (AX,0) AE (- D,0)
DISPLAY   (-,0) D (IH,AX) S - P L EH (IH,0) (AX,0)
DIVIDE    (-,0) D (IH,AX) V (F,0) AH (IH,0)
EIGHT     (-,0) (HH,0) (AX,0) EH (- T,0)
FIVE      (-, (-,0)) F AH (IH,0) V
FOUR      (-, (-,0)) F AH ER
LOAD      (-,0) L OW (AX,0)
MINUS     (-,0) M AH (IH,0) (AX,0) N IH S
MULTIPLY  (-,0) M AA (EH,0) L (-,0) T AX ( (-,0),-) P L AH (IH,0) (AX,0)
NINE      (-,0) N AH (IH,0) (AX,0) N
ONE       (-,0) W AH N
SEVEN     (-,0) S EH V (AX,AX,0) N
SIX       (-,0) S IH ( -) S
SUBTRACT  (-,0) S (AX,UH) - T ER AE (- T,0)
THREE     (-, (-,0)) F ER IY (AX,0)
TWO       (-, (-,0)) T IH UW
ZERO      (-,0) S (AX,0) IH (ER,0) OW
[
]


C.7.  Voice Programming Language


C.7.1  Voice Programming Syntax

<REQUEST>::=  [ <COMMAND> ]

<COMMAND>::=  <SET-WORD> <SIMPLE-EXPRE> <IN-WORD> <VARIABLEDF>
              <VARIABLE> <GET-WORD> <SIMPLE-EXPRF>
              <SHOW-WORD> <SIMPLE-EXPRF>

<SET-WORD>::=  STORE
               PUT

<IN-WORD>::=  IN
              INTO

<GET-WORD>::=  GETS
               BECOMES

<SHOW-WORD>::=  WHAT IS
                SHOW

<BIN-OPE>::=  PLUS
              MINUS
              TIMES
              DIVIDE
              MOD
              POWER
              MAX
              MIN

<UN-OPE>::=  NEGATE
             ABSOLUTE
             FACT

<BIN-OPF>::=  PLUS
              MINUS
              TIMES
              DIVIDE
              MOD
              POWER
              MAX
              MIN

<UN-OPF>::=  NEGATE
             ABSOLUTE
             FACT

<SIMPLE-EXPRE>::=  <PRIMARYCE> <BIN-OPE> <PRIMARYDE>
                   <UN-OPE> <PRIMARYDE>
                   <PRIMARYDE>

<VARIABLECE>::=  ALPHA
                 BETA
                 GAMMA
                 DELTA
                 EPSILON

<PRIMARYCE>::=  <RADIXCE> <INTEGERCE>
                <INTEGERCE>
                <VARIABLECE>

<RADIXCE>::=  OCTAL
              DECIMAL

<INTEGERCE>::=  <DIGITACE> <INTEGERCE2>
                <DIGITACE>

<DIGITACE>::=  ZERO
               ONE
               TWO
               THREE
               FOUR
               FIVE
               SIX
               SEVEN
               EIGHT
               NINE

<INTEGERCE2>::=  <DIGITACE2> <INTEGERCE>
                 <DIGITACE2>

<DIGITACE2>::=  ZERO
                ONE
                TWO
                THREE
                FOUR
                FIVE
                SIX
                SEVEN
                EIGHT
                NINE

<VARIABLEDE>::=  ALPHA
                 BETA
                 GAMMA
                 DELTA
                 EPSILON

<VARIABLE>::=  ALPHA
               BETA
               GAMMA
               DELTA
               EPSILON

<PRIMARYDE>::=  <RADIXDE> <INTEGERDE>
                <INTEGERDE>
                <VARIABLEDE>

<RADIXDE>::=  OCTAL
              DECIMAL

<INTEGERDE>::=  <DIGITADE> <INTEGERDE2>
                <DIGITADE>

<DIGITADE>::=  ZERO
               ONE
               TWO
               THREE
               FOUR
               FIVE
               SIX
               SEVEN
               EIGHT
               NINE

<INTEGERDE2>::=  <DIGITADE2> <INTEGERDE>
                 <DIGITADE2>

<DIGITADE2>::=  ZERO
                ONE
                TWO
                THREE
                FOUR
                FIVE
                SIX
                SEVEN
                EIGHT
                NINE

<SIMPLE-EXPRF>::=  <PRIMARYCF> <BIN-OPF> <PRIMARYDF>
                   <UN-OPF> <PRIMARYDF>
                   <PRIMARYDF>

<VARIABLECF>::=  ALPHA
                 BETA
                 GAMMA
                 DELTA
                 EPSILON

<PRIMARYCF>::=  <RADIXCF> <INTEGERCF>
                <INTEGERCF>
                <VARIABLECF>

<RADIXCF>::=  OCTAL
              DECIMAL

<INTEGERCF>::=  <DIGITACF> <INTEGERCF2>
                <DIGITACF>

<DIGITACF>::=  ZERO
               ONE
               TWO
               THREE
               FOUR
               FIVE
               SIX
               SEVEN
               EIGHT
               NINE

<INTEGERCF2>::=  <DIGITACF2> <INTEGERCF>
                 <DIGITACF2>

<DIGITACF2>::=  ZERO
                ONE
                TWO
                THREE
                FOUR
                FIVE
                SIX
                SEVEN
                EIGHT
                NINE

<VARIABLEDF>::=  ALPHA
                 BETA
                 GAMMA
                 DELTA
                 EPSILON

<PRIMARYDF>::=  <RADIXDF> <INTEGERDF>
                <INTEGERDF>
                <VARIABLEDF>

<RADIXDF>::=  OCTAL
              DECIMAL

<INTEGERDF>::=  <DIGITADF> <INTEGERDF2>
                <DIGITADF>

<DIGITADF>::=  ZERO
               ONE
               TWO
               THREE
               FOUR
               FIVE
               SIX
               SEVEN
               EIGHT
               NINE

<INTEGERDF2>::=  <DIGITADF2> <INTEGERDF>
                 <DIGITADF2>

<DIGITADF2>::=  ZERO
                ONE
                TWO
                THREE
                FOUR
                FIVE
                SIX
                SEVEN
                EIGHT
                NINE


C.7.2  Voice Programming Dictionary


ABSOLUTE  (-,0) (HH,0) (AX,0) AE ( -) S (AX,0) L UW (- T,0)
ALPHA     (-,0) (HH,0) AX AE (EH,0) L ( (-,0),0) (F,0) (AH)
BECOMES   (-, (-,0)) (Q,HH) (IY,IH) ( -) K AH M S
BETA      (-, (-,0)) (B,HH) EH (D,) (T,0) AH
DECIMAL   (-,0) D EH S M (EH,0) L
DELTA     (-,0) D EH L ((,N) ((-,0) T,0),D) AH
DIVIDE    (-,0) D AX V (-,0) Y (AX,0) ( (AX,HH,0),0)
EIGHT     (-,0) (HH,0) (AX,0) EH (- T,0)
EPSILON   (-,0) (HH,0) (AX,0) (EH,AX) (-„ -) S (AX,0) L AO N
FACT      (-, (-,0),0) F AE (,-) (- T,0)
FIVE      (-, (-,0),0) F Y V
FOUR      (-, (-,0),0) F AH ER
GAMMA     (-,0) G AE M AH
GETS      (-,0) G IH (AX,0) ( -) S
IN        (-,0) (HH,0) (AX,IH) N
INTO      (-,0) (HH,0) (AX,IH) N (,0) (-,0) T AX
IS        (-,0) (HH,0) (AX,0) IH (IY,AX,0) (S (S,0),(S,0) S)
MAX       (-,0) M AE ( ,0) - S
MIN       (-,0) M IH N
MINUS     (-,0) M Y N AX S
MOD       (-,0) M AA
NEGATE    (-,0) N (AX,EH) (-,0) G EH (- T,0)
NINE      (-,0) N Y (AX,0) N
OCTAL     (-,0) AA ( ,0) - T (EH,0) L
ONE       (-,0) W AH N
PLUS      (-, (-,0)) P L AH S
POWER     (-, (-,0)) P AA UH ER
PUT       (-, (-,0)) P UH (- T,0)
SEVEN     (-,0) S EH V (AX,0) N
SHOW      (-,0) SH AH OW (OW (,0),0)
SIX       (-,0) S IH ( -) S
STORE     (-,0) S - T AH ER
THREE     (-, (-,0)) F (,0) ER IY
TIMES     (-, (-,0)) T Y M S
TWO       (-, (-,0)) T IH UW
WHAT      (-,0) (HH,0) W AA (- T,0)
ZERO      (-,0) S IH ER OW (AX,0)

1 


C.8.  IBM "New Raleigh" Language


C.8.1  IBM "New Raleigh" Syntax

<S>::=  <BOX0> <BOX0X>

<BOX0X>::=  <BOX1> <BOX1X>
            <BOX2> <BOX2X>
            <BOX3> <BOX3X>
            <BOX4> <BOX4X>

<BOX1X>::=  <BOX5> <BOX5X>
            <BOX9> <BOX9X>

<BOX5X>::=  <BOX9> <BOX9X>

<BOX9X>::=  <BOX13> <BOX13X>
            <BOX14> <BOX14X>

<BOX13X>::=  <BOX21> <BOX21X>

<BOX14X>::=  <BOX24> <BOX24X>
             <BOX25> <BOX25X>

<BOX2X>::=  <BOX6> <BOX6X>
            <BOX10> <BOX10X>

<BOX6X>::=  <BOX10> <BOX10X>

<BOX10X>::=  <BOX15> <BOX15X>
             <BOX16> <BOX16X>

<BOX15X>::=  <BOX21> <BOX21X>

<BOX16X>::=  <BOX24> <BOX24X>
             <BOX25> <BOX25X>

<BOX3X>::=  <BOX7> <BOX7X>
            <BOX11> <BOX11X>

<BOX7X>::=  <BOX11> <BOX11X>

<BOX11X>::=  <BOX17> <BOX17X>
             <BOX18> <BOX18X>

<BOX17X>::=  <BOX21> <BOX21X>

<BOX18X>::=  <BOX24> <BOX24X>
             <BOX25> <BOX25X>

<BOX4X>::=  <BOX8> <BOX8X>
            <BOX12> <BOX12X>

<BOX8X>::=  <BOX12> <BOX12X>

<BOX12X>::=  <BOX19> <BOX19X>
             <BOX20> <BOX20X>

<BOX19X>::=  <BOX21> <BOX21X>

<BOX20X>::=  <BOX24> <BOX24X>
             <BOX25> <BOX25X>

<BOX21X>::=  <BOX22> <BOX22X>
             <BOX23> <BOX23X>

<BOX22X>::=  <BOX28> <BOX28X>
             <BOX29> <BOX29X>

<BOX23X>::=  <BOX26> <BOX26X>
             <BOX27> <BOX27X>

<BOX24X>::=  <BOX26> <BOX26X>
             <BOX27> <BOX27X>

<BOX25X>::=  <BOX28> <BOX28X>
             <BOX29> <BOX29X>

<BOX26X>::=  <BOX30> <BOX30X>

<BOX30X>::=  <BOX34> <BOX34X>

<BOX27X>::=  <BOX31> <BOX31X>

<BOX31X>::=  <BOX35> <BOX35X>

<BOX28X>::=  <BOX32> <BOX32X>

<BOX32X>::=  <BOX36> <BOX36X>

<BOX29X>::=  <BOX33> <BOX33X>

<BOX33X>::=  <BOX37> <BOX37X>

<BOX34X>::=  <BOX38>

<BOX35X>::=  <BOX38>

<BOX36X>::=  <BOX38>

<BOX37X>::=  <BOX38>

<BOX0>::=  [

<BOX1>::=  ONE

<BOX2>::=  EACH

<BOX3>::=  SOME

<BOX4>::=  SHOULD

<BOX5>::=  BAD
           BLACK
           GENTLE
           GREAT
           PRIMARY
           PROFICIENT
           QUIET
           RECOGNITION
           SMALL
           SUFFICIENT

<BOX6>::=  DISTANT
           EAGER
           KIND
           LARGE
           NEW
           OTHER
           TINY
           TIRED
           TRUE
           UGLY

<BOX7>::=  ACTIVE
           DEMOCRATIC
           FAIR
           LITTLE
           PRACTICAL
           POOR
           REAL
           SAFE
           SHORT
           STRONG

<BOX8>::=  BACKWARD
           BIG
           CLOSE
           GOOD
           IMPORTANT
           OLD
           PASSIVE
           RUGGED
           SEPARATE
           USELESS

<BOX9>::=  CONDITION
           DURATION
           GENERAL
           PRIVATE
           SERGEANT
           TRAIN
           VILLAGE

<BOX10>::=  DIVISION
            PART
            PERIOD
            POWER
            TIME
            TOWN
            WAR

<BOX11>::=  MATTERS
            MEN
            PEOPLE
            PRACTICES
            STREETS
            TREATIES
            WORKERS

<BOX12>::=  ACTIONS
            BASES
            BATTLES
            COMMANDS
            FORMS
            GROUNDS
            PLACES

<BOX13>::=  CONSIDERED
            CREATED
            GAVE
            LIKED
            MADE
            MOVED
            PERMITTED
            WANTED

<BOX14>::=  CHANGES
            DOES
            FIGHTS
            FEELS
            GOES
            LIVES
            PROPOSES
            VOTES

<BOX15>::=  CONTRIBUTED
            CRITICIZED
            DISTURBED
            FORGOT
            GOVERNED
            HAD
            SHOWED
            TOOK

<BOX16>::=  APPEARS
            APPROVES
            DRINKS
            HAS
            IS
            LOOKS
            TAKES
            WORKS

<BOX17>::=  ACCEPTED
            APPLIED
            BROUGHT
            DETECTED
            FOUND
            OUTLAWED
            REJECTED
            SAVED

<BOX18>::=  ASK
            GET
            KNOW
            MAKE
            PAY
            RAN
            SURVIVE
            WERE

<BOX19>::=  BE
            CALL
            CARRY
            CONTROL
            HAVE
            THINK
            TRY
            TURN

<BOX20>::=  BELIEVE
            COME
            DO
            DIRECT
            FOLLOW
            PROCEED
            SEEM
            STAND

<BOX21>::=  THE

<BOX22>::=  BUILDING
            CAPTAIN
            CAUSE
            CITY
            COUNTRY
            LETTER
            MAJOR
            MAN
            NATION
            OFFICER
            REPORT
            THOUGHT

<BOX23>::=  BUS
            CAMPAIGN
            FOOD
            GUN
            MOTION
            NAME
            RADIO
            SHIP
            STATE
            TELEPHONE
            THING
            WEAPON

<BOX24>::=  AGAIN
            EXCESSIVELY
            LEAST
            MAJORLY
            MERELY
            MOSTLY
            NOT
            ONLY
            PRINCIPALLY
            PROPERLY
            SOMETIMES
            TRULY

<BOX25>::=  ALWAYS
            FINALLY
            FREQUENTLY
            LESS
            MORE
            NEVER
            OCCASIONALLY
            SELDOMLY
            USUALLY

<BOX26>::=  ACROSS
            AT
            FROM
            ON
            TOWARD
            UNDER

<BOX27>::=  AGAINST
            FOR
            IN
            INTO
            THROUGH
            TO

<BOX28>::=  AROUND
            BEFORE
            DURING
            OVER
            PAST
            WITH

<BOX29>::=  ABOUT
            AFTER
            AMONG
            BETWEEN
            BY
            WITHOUT

<BOX30>::=  THOSE

<BOX31>::=  THE

<BOX32>::=  THE

<BOX33>::=  THOSE

<BOX34>::=  APPROACHES
            ENGINEERS
            GIRLS
            ISSUES
            LOCATIONS
            OPERATIONS
            PLANS
            PROBLEMS
            SITES
            ZONES

<BOX35>::=  AIRPLANE
            BUSINESS
            ENGINE
            MACHINE
            MISSILE
            MOMENT
            ORDER
            PRODUCT
            USE
            YEAR

<BOX36>::=  CAPITOL
            CONCERN
            COVER
            DAY
            INTERVAL
            LIFE
            PURPOSE
            SACRIFICE
            VEHICLE
            WEEK

<BOX37>::=  CAMPS
            FIELDS
            HOUSES
            INTERESTS
            METHODS
            SCIENTISTS
            SERVICES
            SOLDIERS
            SYSTEMS
            TECHNIQUES

<BOX38>::=  ]


C.8.2  IBM  "New  Raleigh"  Dictionary 


ABOUT 

(-,0)  AH  B AA  AX  T 

ACCEPTED 

(-.0)  IH  K S EH  P T (-.0)  IH  D 

ACROSS 

(-,0)  AH  K ER  AA  UH  S 

ACTIONS 

(-.0)  AE  K SH  AH  N (-,0)  S 

ACTIVE 

(-.0)  AE  K T IH  V 

AFTER 

(-.0)  AE  F T ER 

AGAIN 

(-.0)  AH  G EH  N 

AGAINST 

(-.0)  AH  G EH  N S T 

AIRPLANE 

(-,0)  EH  AX  ER  (-.0)  P L EH  (IH.AX)  N 

ALWAYS 

(-.0)  AA  UH  L W EH  (IH.AX)  S 

AMONG 

(-.0)  AH  M AH  NX 

APPEARS 

(-.0)  AH  P EH  (IH,AX)  AX  ER  (-,0)  S 

APPLIED 

(-.0)  AH  P L AA  AX  (-.0)  D 

APPROACHES 

(-.0)  AH  P ER  OW  (-.0)  T SH  (-.0)  IH  S 

APPROVES 

(-,0)  AH  P ER  UW  V (-.0)  S 

AROUND 

(-.0)  AH  ER  AA  AX  N D 

ASK 

(-.0)  AE  S K 

AT 

(-.0)  AE  T 

BACKWARD 

(-.0)  B AE  K W ER  D 

BAD 

(-.0)  B AE  D 

BASES 

(-.0)  B EH  (IH,AX)  S (-.0)  IH  S 

BATTLES 

(-.0)  B AE  T AH  L (-.0)  S 

BE 

(-.0)  B EH  (IH.AX) 

BEFORE 

(-.0)  B EH  (IH.AX)  F OW  AX  ER 

BELIEVE 

(-,C)  B EH  (IH.AX)  L EH  (IH.AX)  V 

BETWE'^N 

(-.0)  B EH  (IH,AX)  T W EH  (IH,AX)  N 

BIG 

(-.0)  B IH  G 

BLACK 

(-.0)  B L AE  K 

BROUGHT 

(-.0)  B ER  AA  UH  T 

BUILDING 

(-.0)  B IH  L D IH  NX 

BUS 

(-.0)  B AH  S 

BUSINESS 

(-.0)  B IH  S N IH  S 

BY 

(-.0)  B AA  AX 

CALL 

(-.0)  K AA  UH  L 

CAMPAIGN 

(-.0)  K AE  M P EH  (IH.AX)  N 

CAMPS 

(-.0)  K AE  M P (-.0)  S 

CAPITOL 

(-,0)  K AE  P IH  T AH  L 

CAPTAIN 

(-,0)  K AE  P T IH  N 

CARRY 

(-.0)  K AE  ER  EH  (IH,AX) 

CAUSE 

(-.0)  K AA  UH  S 

CHANGES 

(-.0)  (-,0)  T SH  EH  OH, AX)  N (-,0)  D SH 

CITY 

(-.0)  S IH  T EH  (IH.AX) 

CLOSE 

(-.0)  K L OW  S 

COME 

(-,0)  K AH  M 

COMMANDS 

(-,0)  K AH  M AE  N D (-,0)  S 

CONCERN 

(-.0)  K AH  N SI  ER  N 

liWRirifStp* 


121 


CONDITION

(-,0) K AH N D IH SH AH N

CONSIDERED

(-,0) K AH N S IH D ER (-,0) D

CONTRIBUTED

(-,0) K AH N T ER IH B Y UW T (-,0) IH D

CONTROL

(-,0) K AH N T ER OW L

COUNTRY

(-,0) K AH N T ER EH (IH,AX)

COVER

(-,0) K AH V ER

CREATED

(-,0) K ER EH (IH,AX) EH (IH,AX) T (-,0) IH D

CRITICIZED

(-,0) K ER IH T IH S AA AX S (-,0) D

DAY

(-,0) D EH (IH,AX)

DEMOCRATIC

(-,0) D EH M AH K ER AE T IH K

DETECTED

(-,0) D EH (IH,AX) T EH K T (-,0) IH D

DIRECT

(-,0) D IH ER EH K T

DISTANT

(-,0) D IH S T AH N T

DISTURBED

(-,0) D IH S T ER B (-,0) D

DIVISION

(-,0) D IH V IH SH AH N

DO

(-,0) D UW

DOES

(-,0) D AH S

DRINKS

(-,0) D ER IH NX K (-,0) S

DURATION

(-,0) D Y UW ER EH (IH,AX) SH AH N

DURING

(-,0) D Y UW ER IH NX

EACH

(-,0) EH (IH,AX) (-,0) T SH

EAGER

(-,0) EH (IH,AX) G ER

ENGINE

(-,0) EH N (-,0) D SH IH N

ENGINEERS

(-,0) EH N (-,0) D SH IH N EH (IH,AX) AX ER (-,0) S

EXCESSIVELY

(-,0) EH K S EH S IH V (-,0) L EH (IH,AX)

FAIR

(-,0) F EH AX ER

FEELS

(-,0) F EH (IH,AX) L (-,0) S

FIELDS

(-,0) F EH (IH,AX) L D (-,0) S

FIGHTS

(-,0) F AA AX T (-,0) S

FINALLY

(-,0) F AA AX N AH L (-,0) EH (IH,AX)

FOLLOW

(-,0) F AA L OW

FOOD

(-,0) F UW D

FOR

(-,0) F AA UH AX ER

FORGOT

(-,0) F AA UH ER G AA T

FORMS

(-,0) F AA UH AX ER M (-,0) S

FOUND

(-,0) F AA AX N D

FREQUENTLY

(-,0) F ER EH (IH,AX) K W AH N T (-,0) L EH (IH,AX)

FROM

(-,0) F ER AH M

GAVE

(-,0) G EH (IH,AX) V

GENERAL

(-,0) (-,0) D SH EH N ER AH L

GENTLE

(-,0) (-,0) D SH EH N T AH L

GET

(-,0) G EH T

GIRLS

(-,0) G ER L (-,0) S

GOES

(-,0) G OW (-,0) S

GOOD

(-,0) G UH D

GOVERNED

(-,0) G AH V ER N (-,0) D

GREAT

(-,0) G ER EH (IH,AX) T

GROUNDS

(-,0) G ER AA AX N D (-,0) S

GUN

(-,0) G AH N

HAD

(-,0) HH AE D



HAS

(-,0) HH AE S

HAVE

(-,0) HH AE V

HOUSES

(-,0) HH AA AX S (-,0) IH S

IMPORTANT

(-,0) IH M P AA UH AX ER T AH N T

IN

(-,0) IH N

INTERESTS

(-,0) IH N T ER EH S T (-,0) S

INTERVAL

(-,0) IH N T ER V AH L

INTO

(-,0) IH N (-,0) T UW

IS

(-,0) IH S

ISSUES

(-,0) IH SH Y UW (-,0) S

KIND

(-,0) K AA AX N D

KNOW

(-,0) N OW

LARGE

(-,0) L AA AX ER (-,0) D SH

LEAST

(-,0) L EH (IH,AX) S T

LESS

(-,0) L EH S

LETTER

(-,0) L EH T ER

LIFE

(-,0) L AA AX F

LIKED

(-,0) L AA AX K (-,0) T

LITTLE

(-,0) L IH T AH L

LIVES

(-,0) L IH V (-,0) S

LOCATIONS

(-,0) L OW K EH (IH,AX) SH AH N (-,0) S

LOOKS

(-,0) L UH K (-,0) S

MACHINE

(-,0) M AH SH EH (IH,AX) N

MADE

(-,0) M EH (IH,AX) D

MAJOR

(-,0) M EH (IH,AX) (-,0) D SH ER

MAJORLY

(-,0) M EH (IH,AX) (-,0) D SH ER (-,0) L EH (IH,AX)

MAKE

(-,0) M EH (IH,AX) K

MAN

(-,0) M AE N

MATTERS

(-,0) M AE T ER (-,0) S

MEN

(-,0) M EH N

MERELY

(-,0) M EH (IH,AX) AX ER (-,0) L EH (IH,AX)

METHODS

(-,0) M EH F AH D (-,0) S

MISSILE

(-,0) M IH S IH L

MOMENT

(-,0) M OW M EH N T

MORE

(-,0) M OW AX ER

MOSTLY

(-,0) M OW S T (-,0) L EH (IH,AX)

MOTION

(-,0) M OW SH AH N

MOVED

(-,0) M UW V (-,0) D

NAME

(-,0) N EH (IH,AX) M

NATION

(-,0) N EH (IH,AX) SH AH N

NEVER

(-,0) N EH V ER

NEW

(-,0) N Y UW

NOT

(-,0) N AA T

OCCASIONALLY

(-,0) AH K EH (IH,AX) SH AH N AH L (-,0) EH (IH,AX)

OFFICER

(-,0) AA UH F IH S ER

OFTEN

(-,0) AA UH F AH N

OLD

(-,0) OW L D

ON

(-,0) AA N

ONCE

(-,0) W AH N S

ONE

(-,0) W AH N

ONLY

(-,0) OW N L EH (IH,AX)

OPERATIONS

(-,0) AA P AH ER EH (IH,AX) SH AH N (-,0) S

ORDER

(-,0) AA UH AX ER D ER

OTHER

(-,0) AH DH ER

OUTLAWED

(-,0) AA AX T L AA UH (-,0) D

OVER

(-,0) OW V ER

PART

(-,0) P AA AX ER T

PASSIVE

(-,0) P AE S IH V

PAST

(-,0) P AE S T

PAY

(-,0) P EH (IH,AX)

PEOPLE

(-,0) P EH (IH,AX) P AH L

PERIOD

(-,0) P EH (IH,AX) ER EH (IH,AX) IH D

PERMITTED

(-,0) P ER M IH T (-,0) IH D

PLACES

(-,0) P L EH (IH,AX) S (-,0) IH S

PLANS

(-,0) P L AE N (-,0) S

POOR

(-,0) P UW AX ER

POWER

(-,0) P AA AX AX ER

PRACTICAL

(-,0) P ER AE K T IH K AH L

PRACTICES

(-,0) P ER AE K T IH S (-,0) IH S

PRIMARY

(-,0) P ER AA AX M EH ER EH (IH,AX)

PRINCIPALLY

(-,0) P ER IH N S IH P (-,0) L EH (IH,AX)

PRIVATE

(-,0) P ER AA AX V IH T

PROBLEMS

(-,0) P ER AA B L AH M (-,0) S

PROCEED

(-,0) P ER OW S EH (IH,AX) D

PRODUCT

(-,0) P ER AA D AH K T

PROFICIENT

(-,0) P ER OW F IH SH AH N T

PROPERLY

(-,0) P ER AA P ER (-,0) L EH (IH,AX)

PROPOSES

(-,0) P ER OW P OW S (-,0) IH S

PURPOSE

(-,0) P ER P AH S

QUIET

(-,0) K W AA AX IH T

RADIO

(-,0) ER EH (IH,AX) D EH (IH,AX) OW

RAN

(-,0) ER AE N

RARELY

(-,0) ER EH AX ER (-,0) L EH (IH,AX)

REAL

(-,0) ER EH (IH,AX) L

RECOGNITION

(-,0) ER EH K IH G N IH SH AH N

REJECTED

(-,0) ER EH (IH,AX) (-,0) D SH EH K T (-,0) IH D

REPORT

(-,0) ER EH (IH,AX) P OW AX ER T

RUGGED

(-,0) ER AH G IH D

SACRIFICE

(-,0) S AE K ER IH F AA AX S

SAFE

(-,0) S EH (IH,AX) F

SAVED

(-,0) S EH (IH,AX) V (-,0) D

SCIENTISTS

(-,0) S AA AX IH N T IH S T (-,0) S

SEEM

(-,0) S EH (IH,AX) M

SELDOMLY

(-,0) S EH L D AH M (-,0) L EH (IH,AX)

SEPARATE

(-,0) S EH P ER IH T

SERGEANT

(-,0) S AA AX ER (-,0) D SH AH N T

SERVICES

(-,0) S ER V IH S (-,0) IH S

SHIP

(-,0) SH IH P

SHORT

(-,0) SH AA UH AX ER T

SHOULD

(-,0) SH UH D

SHOWED

(-,0) SH OW (-,0) D

SITES

(-,0) S AA AX T (-,0) S

SMALL

(-,0) S M AA UH L

SOLDIERS

(-,0) S OW L (-,0) D SH ER (-,0) S

SOME

(-,0) S AH M

SOMETIMES

(-,0) S AH M T AA AX M S

STAND

(-,0) S T AE N D

STATE

(-,0) S T EH (IH,AX) T

STREETS

(-,0) S T ER EH (IH,AX) T (-,0) S

STRONG

(-,0) S T ER AA UH NX

SUFFICIENT

(-,0) S AH F IH SH AH N T

SURVIVE

(-,0) S ER V AA AX V

SYSTEMS

(-,0) S IH S T IH M (-,0) S

TAKES

(-,0) T EH (IH,AX) K (-,0) S

TECHNIQUES

(-,0) T EH K N EH (IH,AX) K (-,0) S

TELEPHONE

(-,0) T EH L IH F OW N

THE

(-,0) DH AH

THING

(-,0) F IH NX

THINK

(-,0) F IH NX K

THOSE

(-,0) DH OW S

THOUGHT

(-,0) F AA UH T

THROUGH

(-,0) F ER IH AX

TIME

(-,0) T AA AX M

TINY

(-,0) T AA AX N EH (IH,AX)

TIRED

(-,0) T AA AX AX ER (-,0) D

TO

(-,0) T UW

TOOK

(-,0) T UH K

TOWARD

(-,0) T AH W AA UH AX ER D

TOWN

(-,0) T AA AX N

TRAIN

(-,0) T ER EH (IH,AX) N

TREATIES

(-,0) T ER EH (IH,AX) T EH (IH,AX) (-,0) S

TRUE

(-,0) T ER IH AX

TRULY

(-,0) T ER IH AX (-,0) L EH (IH,AX)

TRY

(-,0) T ER AA AX

TURN

(-,0) T ER N

UGLY

(-,0) AH G L EH (IH,AX)

UNDER

(-,0) AH N D ER

USE

(-,0) Y UW S

USELESS

(-,0) Y UW S (-,0) L EH S

USUALLY

(-,0) Y UW SH UW AH L (-,0) EH (IH,AX)

VEHICLE

(-,0) V EH (IH,AX) HH IH K AH L

VILLAGE

(-,0) V IH UH (-,0) D SH

VOTES

(-,0) V OW T (-,0) S

WANTED

(-,0) W AA N T (-,0) IH D

WAR

(-,0) W AA UH AX ER

WEAPON

(-,0) W EH P AH N

WEEK

(-,0) W EH (IH,AX) K

WERE

(-,0) W ER

WITH

(-,0) W IH DH

WITHOUT

(-,0) W IH DH AA AX T



WORKERS

(-,0) W ER K (-,0) ER (-,0) S

WORKS

(-,0) W ER K (-,0) S

YEAR

(-,0) Y EH (IH,AX) AX ER

ZONES

(-,0) S OW N (-,0) S

[

]
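The parenthesised groups in the entries above appear to list alternative phones, so a single entry stands for several pronunciations. A minimal Python sketch of that expansion follows. It assumes that `0` inside a group means that nothing is spoken and that `-` is a silence symbol. Both readings, and the helper name `expand`, are assumptions made for illustration and are not taken from the thesis.

```python
import itertools
import re


def expand(entry: str) -> list[list[str]]:
    """Expand one dictionary entry into every phone sequence it allows."""
    # A token is either a parenthesised group such as "(IH,AX)" or a bare phone.
    tokens = re.findall(r"\([^)]*\)|[^\s()]+", entry)
    slots = []
    for tok in tokens:
        if tok.startswith("("):
            # "(IH,AX)" -> ["IH", "AX"]
            slots.append([alt.strip() for alt in tok[1:-1].split(",")])
        else:
            slots.append([tok])
    sequences = []
    for choice in itertools.product(*slots):
        # Assumption: "0" is the empty alternative, so it is dropped.
        sequences.append([phone for phone in choice if phone != "0"])
    return sequences


# The entry for BE, as listed in the dictionary above.
be = expand("(-,0) B EH (IH,AX)")
for seq in be:
    print(" ".join(seq))
```

Under these assumptions the entry for BE yields four sequences: with or without the leading `-`, and ending in either IH or AX.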




C.9. LLBAS: Lincoln Lab "Basic" Language


C.9.1 LLBAS: Lincoln Lab "Basic" Syntax

<SENT> ::= [ <SS> ]

<SS> ::= <DIS>
         <CON>
         <CLR>
         <GO>
         <DEL>
         <SK>
         <MOV>
         <COMP>
         <GET>
         <PIC>
         <WRITE>
         <PUT>
         <LIST>
         <OUTP>
         <SET>


<DIS> ::= <DISPV> <DISOBJ>
          <DISPV> <DISOBJ> <DISWH>

<DISPV> ::= DISPLAY
            REDISPLAY
            SHOW-ME

<DISOBJ> ::= THE <DISOBJ1>

<DISOBJ1> ::= ALL MATCHES
              ALL MATCHES <DW>
              <DISCLS>
              <DISCLS> <DUDW>
              FORMANTS
              FORMANTS <DU>
              FORMANT <PAR>
              FORMANT <PAR> <DU>

<DUDW> ::= <DU>
           <DW>

<DW> ::= OF <DET> <UTT>

<DET> ::= THE
          THIS

<DU> ::= FOR THE <DISWRD>

<DISCLS> ::= <PAR>
             <MEAS> <PAR>
             <LABS> LABELS
             <DATFOR>

<MEAS> ::= AVERAGE




<PAR> 

<PAR1> 

<LABS> 
<DATFOR>
<DATFOR1> ::=

<DISWRD> 

<DISMOD> 

<PHONS> 

<VOIC> 

<POS> 

<FRIC>


MAXIMUM 

MINIMUM 

TOTAL

<PAR1>

FIRST  MOMENT 

AMPLITUDE 

PITCH 

FREQUENCY 

GRAPH 

ENERGY 

ZEROCROSSING-DENSITY 

EDITED 

PHONEMIC 

HAND 

<DATFORl> 

CONFUSION  MATRIX 
EVENT  ARRAY  <S> 
ENVELOPE  <S> 
SPECTROGRAM  <S> 
WAVEFORM  <S> 
FORMANTS 
SPECTRUM 
SPECTRA 
SEGMENTATION 
<PHONS> 

<DISMOD> <PHONS>
<DISMOD> WORD
<LEN> 

<ORD> 

<VOW>

<POS> <VOW>

<STOP> 

<VOIC>  <STOP> 

<NAS> 

<FRIC> 

<VOIC> <FRIC>
SONORANT  <S> 
CONSONANT  <S> 
DIPHTHONG  <S> 

VOICED 

UNVOICED 

VOICELESS 

FRONT 

BACK 

HIGH 

LOW 

MID 

FRICATIVE  <S> 
AFFRICATE  <S> 

<STOP> ::= STOP <S>
           PLOSIVE <S>

<VOW> ::= VOWEL <S>

<NAS> ::= NASAL <S>
          LIQUID <S>
          GLIDE <S>

<LEN> ::= LONGEST
          SHORTEST

<ORD> ::= FIRST
          SECOND
          THIRD
          FOURTH

<DISWH> ::= ON THE <SCO>

<SCO> ::= <DISDEV>
          <SCTYPE> <DISDEV>

<SCTYPE> ::= HUGHES
             REFRESH

<DISDEV> ::= SCOPE
             DISPLAY
             SCREEN

<UTT> ::= ENTRY <S>
          UTTERANCE <S>
          SENTENCE <S>
          SLOT <S>
          FILE <S>


<CON> ::= <CONV> <UNIT> <DIGIT> TO <COND>
          <CONV> TAPE <UNIT> <DIGIT> TO <COND>

<CONV> ::= CONNECT
           ASSIGN

<COND> ::= <DETM> <TERM>
           <TERM> <DIGIT>
           <TERM> NUMBER <DIGIT>

<DETM> ::= THE
           THIS
           MY

<TERM> ::= CONSOLE
           TERMINAL

<DIGIT> ::= ONE
            TWO
            THREE
            FOUR
            FIVE
            SIX
            SEVEN
            EIGHT
            NINE

<UNIT> ::= UNIT
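A sentence can be checked against rules of this kind with a simple top-down matcher. The sketch below is illustrative only: it encodes the <CON> rules as read from the listing above (the grouping of alternatives under each nonterminal is an assumption recovered from the page layout), and the names `GRAMMAR`, `match` and `accepts` are hypothetical.

```python
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Symbols not present as keys are treated as words.
GRAMMAR = {
    "<CON>": [
        ["<CONV>", "<UNIT>", "<DIGIT>", "TO", "<COND>"],
        ["<CONV>", "TAPE", "<UNIT>", "<DIGIT>", "TO", "<COND>"],
    ],
    "<CONV>": [["CONNECT"], ["ASSIGN"]],
    "<COND>": [
        ["<DETM>", "<TERM>"],
        ["<TERM>", "<DIGIT>"],
        ["<TERM>", "NUMBER", "<DIGIT>"],
    ],
    "<DETM>": [["THE"], ["THIS"], ["MY"]],
    "<TERM>": [["CONSOLE"], ["TERMINAL"]],
    "<DIGIT>": [[w] for w in
                "ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE".split()],
    "<UNIT>": [["UNIT"]],
}


def match(symbols, words):
    """Return True if the word list can be derived from the symbol list."""
    if not symbols:
        return not words  # success only if every word was consumed
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try each alternative of the nonterminal in turn (backtracking).
        return any(match(alt + rest, words) for alt in GRAMMAR[head])
    # Terminal: it must equal the next word.
    return bool(words) and words[0] == head and match(rest, words[1:])


def accepts(sentence, start="<CON>"):
    return match([start], sentence.split())


print(accepts("CONNECT TAPE UNIT THREE TO MY CONSOLE"))
print(accepts("CONNECT UNIT TO THE TERMINAL"))
```

The matcher terminates here because none of these rules is left-recursive; a grammar with left recursion would need a different strategy.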

<CLR> ::= <CLRV> THE <CLRO>

<CLRV> ::= CLEAR
           ERASE

<CLRO> ::= <SCO>
           <PLOT> OF THE <DATFOR>

<PLOT> ::= PLOT
           GRAPH <S>
           DISPLAY


<GO> ::= <GOV> THE <MODE> MODE

<GOV> ::= GO-INTO
          SWITCH-TO

<MODE> ::= SEARCH
           GRAPHICS
           DISPLAY


<DEL> ::= <DELV> <DELO>

<DELV> ::= DROP
           DELETE

<DELO> ::= <QUANT> <DATFOR>
           <QUANT> <DATFOR> <DELOD>
           <QUANT> <LABS> LABELS
           <QUANT> <LABS> LABELS <DELOD>
           THOSE <SPAN> <CARD> <TIME>

<DELOD> ::= FROM THE <DATDEV>

<QUANT> ::= <QUANT1>
            ALL
            ALL THE

<QUANT1> ::= THE
             THIS

<SPAN> ::= <SPAN1>
           LONGER THAN
           GREATER THAN
           SHORTER THAN
           LESS THAN

<SPAN1> ::= OVER
            UNDER

<CARD> ::= <DIGIT>
           <DIGIT> <HUNDREDS>
           <TENS>
           <TENS> <DIGIT>
           <TEENS>

<TENS> ::= TWENTY
           THIRTY
           FORTY
           FIFTY
           SIXTY
           SEVENTY
           EIGHTY
           NINETY

<TEENS> ::= TEN
            ELEVEN
            TWELVE
            THIRTEEN
            FOURTEEN
            FIFTEEN
            SIXTEEN
            SEVENTEEN
            EIGHTEEN
            NINETEEN

<HUNDREDS> ::= HUNDRED
               HUNDRED <DIGIT>

<TIME> ::= SECONDS
           MILLISECONDS


<SK> ::= <SKV> <SKVO>

<SKV> ::= SKIP
          SKIP-OVER

<SKVO> ::= THE <SEQ> <UTT>
           THE <SEQ> <UTT> <SKVT>
           TO THE <SEQ> <UTT>
           TO THE <SEQ> <UTT> <SKVT>

<SKVT> ::= ON UNIT <DIGIT>
           ON TAPE UNIT <DIGIT>

<SEQ> ::= <SEQ1>
          <ORD>

<SEQ1> ::= NEXT
           CURRENT
           INITIAL
           LAST


<MOV> ::= <MOVE> <MOVO>

<MOVE> ::= MOVE

<MOVO> ::= THE <MOVO2>
           UNIT <CARD> <MOVO3>
           TAPE UNIT <CARD> <MOVO3>

<MOVO2> ::= <MARK> TO THE <SEQ> <SEG>
            <MARK> TO THE <SEQ> <VOIC> <SEG>
            <SIDE> <MARK> TO THE <SEQ> <SEG>
            <SIDE> <MARK> TO THE <SEQ> <VOIC> <SEG>
            <MARK> <DIR> <CARD> <TIME>
            <SIDE> <MARK> <DIR> <CARD> <TIME>
            TAPE <DIR> <CARD> <UTT>
            TAPE TO <SEQ> <UTT>

<MOVO3> ::= TO THE <SEQ> <UTT>
            <DIR> <CARD> <UTT>
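Because rules such as `<GOV> THE <MODE> MODE` above contain no recursion, the sentences they generate can be counted or listed mechanically. The following Python sketch does both for the <GO> rule. The rule grouping is an assumption read off the listing, the names `GRAMMAR`, `count` and `sentences` are hypothetical, and the count equals the number of distinct sentences only if no sentence has two derivations.

```python
from functools import lru_cache
from itertools import product

# Assumed reading of the <GO> rule and its two sub-rules.
GRAMMAR = {
    "<GO>": [["<GOV>", "THE", "<MODE>", "MODE"]],
    "<GOV>": [["GO-INTO"], ["SWITCH-TO"]],
    "<MODE>": [["SEARCH"], ["GRAPHICS"], ["DISPLAY"]],
}


@lru_cache(maxsize=None)
def count(symbol):
    """Number of derivations from a symbol (a word counts as one)."""
    if symbol not in GRAMMAR:
        return 1
    total = 0
    for alternative in GRAMMAR[symbol]:
        ways = 1
        for part in alternative:
            ways *= count(part)  # choices multiply along a sequence
        total += ways            # and add across alternatives
    return total


def sentences(symbol):
    """List every word sequence derivable from a symbol."""
    if symbol not in GRAMMAR:
        return [[symbol]]
    result = []
    for alternative in GRAMMAR[symbol]:
        for parts in product(*(sentences(p) for p in alternative)):
            result.append([word for part in parts for word in part])
    return result


print(count("<GO>"))
for words in sentences("<GO>"):
    print(" ".join(words))
```

Two verbs times three modes gives six sentences, the first of which is GO-INTO THE SEARCH MODE.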




<SIDE> 

<MARK> 

<SEG> 

<DIR> 

<COMP> 

<COMPV> 

<COMPO> 

<COMP2>

<COMP3>

<SHAPE> ::=
<LEV> 

<GET> 

<GETV> 

<GETV1> 

<GETO> 

<GETO1>

<GETO2>


RIGHT 

LEFT 

BOUNDARY 

CURSOR 

FRAME 

SEGMENT 

FORWARD 

BACKWARD 


<COMPV>  THE  <COMPO> 

COMPUTE 
CALCULATE 
RECOMPUTE 
RECALCULATE 
<MEAS>  <PAR> 

<MEAS>  <PAR>  <COMP2> 

DISTRIBUTION  OF  THE  <PHONS> 

EFFECT  OF  <SHAPE>  THE  <LEV>  TO  <CARD> 
IN  <DET>  <COMP3> 

IN  <DET>  <SEQ>  <COMP3> 

<PHONS> 

<SEG> 

<VOIC>  <SEG> 

PUTTING 

SETTING 

INCREASING 

REDUCING 

THRESHOLD 

LEVEL 

GAIN 


<GETV> <GETO>

<GETV1> 

SEARCH  FOR 

FIND 

GET 

RETRIEVE 
GET-ME 
GIVE-ME 
TRY-TO-FIND 
<QUANT> <GETO1>

THE <GETO2>

<UTT>  <CARD> 

<UTT>  INFORMATION 
<DISCLS>  FROM  THE  <DATDEV> 
RANGE  OF  THE  <ORD>  FORMANT 
RANGE  OF  THE  <PAR> 

<UTT>  <START>  WITH  <COM> 




<TENS>  KILOHERTZ  WAVEFORM  <S> 
<SEQ> <UTT> <PREPIN> <DATDEV>
<PHONS>

<PHONS> <INSEN>

<SEQ> <PHONS>

<SEQ> <PHONS> <INSEN>

<SEG>

<SEG> <INSEN>

<VOIC> <SEG>

<VOIC> <SEG> <INSEN>

<SEQ> <SEG>

<SEQ> <SEG> <INSEN>

<SEQ> <VOIC> <SEG>

<SEQ> <VOIC> <SEG> <INSEN>
<PHONS>  FOR  <GET2> 

<PREPIN>  ON 

IN  THE 

<INSEN>  IN  <GET2> 

<GET2>  <UTT>  <CARD> 

<UTT> 

<SEQ> <UTT>


<DATDEV>

THE <UTT>
THE <SEQ> <UTT>
<DATDEV1> 

<DATDEV1> 

DATA  BASE 
TAPE 

<START> 

DRUM 

DISK 

COMPUTER 

BEGINNING 

STARTING 

RETRIEVE 

DELETE 

DISPLAY 

REDISPLAY 


<PIC> 

<PICKV> <QUANT> <PHONS> <PICO>

<PICKV> 

PICKOUT 

SELECT 

<PICO> ::=

IN  THE  <SEQ>  <UTT> 

WITH  THE  <LIM>  ENERGY 

ONLY FROM <UTT> LIST <CARD>

WITH  <ORDER>  STRESS 

<LIM>

LEAST 

MOST 

HIGHEST 

LOWEST 

<ORDER> 

PRIMARY 

SECONDARY 

TERTIARY 


<WRITE> 

<WRITV> 

<WRITO> 

<WRITD> 

<WRIT2> 

<COMPA> 

<PUT> 

<PUTV> 

<PUTO> 

<LIST> 

<LISTV> 

<LISTO> 

<PRDEV> 

<OUTP> 

<OUTPUT> 

<OUTPUTO> 


<SET> 

<SETV> 


<WRITV> <WRITO>

<WRITV> <WRITO> <WRITD>

WRITE 

STORE 

SAVE 

<QUANT> <WRIT2>

EVERYTHING
THE <SEQ> <UTT>

ONTO  <DATDEV1> 

IN  THE  <DATDEV> 

<DISCLS> 

FORMANT  <PAR> 

<COMPA>  VALUE  <S> 

<COMPA>  FIELD  <S> 

<TENS>  KILOHERTZ  WAVEFORM 

COMPUTED 

RECOMPUTED 

CALCULATED 

RECALCULATED 


<PUTV>  <PUTO> 

PUT 

<WRITO> 

<WRITO>  <WRITD> 

THE  <SIDE>  <MARK>  ON  THE  <ORD>  <SEG> 


<LISTV>  <QUANT>  <PHONS> 

<LISTV> <QUANT> <PHONS> <LISTO>

LIST 

PRINT 

ON  THE  <PRDEV> 

FROM  <UTT>  <CARD> 

XEROX 

SCOPE 


<OUTPUT>  THE  <OUTPUTO> 
OUTPUT 

VECTOR  OF  <UTT>  NAMES 
<MEAS>  ENERGY  IN  THE  BAND 


<SETV> THE <SETO>

SET 

RESET 




<SETO> 

BATCH  <TAG>  TO  <CARD> 
DEFAULT SPEAKER TO <ID>
DEFAULT  FOR  SEX  TO  <SEX> 
DEFAULT  FOR  SITE  TO  <SITE> 
COLUMN  <DIM>  TO  <CARD> 
INCREMENT  TO  <CARD> 

<ID>

<INIT> 

<NAME> 

<INIT> 

JA 

RW 

SM 

CW 

<NAME> 

ALLEN 

WIESEN 

MCCANDLESS 

WEINSTEIN 

<DIM> 

WIDTH 

HEIGHT 

<SEX> 

MALE 

FEMALE 

<SITE> 

LL 

BBN 

SRI 

SDC 

CMU 

<TAG> 

CODE 

TAG 

<S> ::=

S 

I 


L 


C.9.2 LLBAS: Lincoln Lab "Basic" Dictionary


AFFRICATE

(-,0) AE F ER (IH,0) K EH T

ALL

(-,0) AO (L,0)

ALLEN

(-,0) AE L (EH,0) N

AMPLITUDE

(-,0) AE M P L (IH,0) T UW D

ARRAY

(-,0) (AH,0) ER EH (IH,AX)

ASSIGN

(-,0) AH S AA IH N

AVERAGE

(-,0) AE V ER IH (-,0) D SH

BACK

(-,0) B AE K

BACKWARD

(-,0) B AE K W ER D

BAND

(-,0) B AE N D

BASE

(-,0) B EH (IH,AX) S

BATCH

(-,0) B AE (-,0) T SH

BBN

(-,0) B IY B IY EH N

BEGINNING

(-,0) B IH G IH N IH NX

BOUNDARY

(-,0) B AA UH N D (AX,0) ER IY

CALCULATE

(-,0) K AE L K (Y,0) (AX,0) L EH (IH,AX) T

CALCULATED

(-,0) K AE L K (Y,0) (AX,0) L EH (IH,AX) T AX D

CLEAR

(-,0) K L IH ER

CMU

(-,0) S IY EH M Y UW

CODE

(-,0) K OW D

COLUMN

(-,0) K AA L (AH,0) M

COMPUTE

(-,0) K (AH,0) M P (Y,0) UW T

COMPUTED

(-,0) K AX M P (Y,0) UW T AX D

COMPUTER

(-,0) K AX M P (Y,0) UW T ER

CONFUSION

(-,0) K AX N F Y UW SH AX N

CONNECT

(-,0) K (AH,0) N EH K T

CONSOLE

(-,0) K AA N S (EH,0) L

CONSONANT

(-,0) K AA N S (AH,0) N AH N T

CURRENT

(-,0) K AH ER (AX,0) N T

CURSOR

(-,0) K ER S ER

CW

(-,0) S IY D AH B (EH,0) L Y UW

DATA

(-,0) D EH (IH,AX) D AH

DEFAULT

(-,0) D IH F AO (L,0) T

DELETE

(-,0) D (AH,0) L IY T

DIPHTHONG

(-,0) D IH F F AA NX

DISK

(-,0) D IH S K

DISPLAY

(-,0) D (IH,0) S P L EH (IH,AX)

DISTRIBUTION

(-,0) D IH S T (ER,0) B Y UW SH (AX,0) N

DROP

(-,0) D (ER,0) AA P

DRUM

(-,0) D (ER,0) AH M

EDITED

(-,0) EH D (AX,0) D EH D

EFFECT

(-,0) IY F EH K T

EIGHT

(-,0) EH (IH,AX) T

EIGHTEEN

(-,0) EH (IH,AX) T IY N

EIGHTY

(-,0) EH (IH,AX) D IY

ELEVEN

(-,0) IY L EH V AX N



ENERGY

(-,0) EH N ER (-,0) D SH IY

ENTRIES

(-,0) EH N (T,0) ER IY S

ENTRY

(-,0) EH N (T,0) ER IY

ENVELOPE

(-,0) AH N V (AX,0) L OW P

ERASE

(-,0) IY ER EH (IH,AX) S

EVENT

(-,0) IY V EH N T

EVERYTHING

(-,0) EH V ER IY F IH NX

FEMALE

(-,0) F IY M EH (IH,AX) L

FIELD

(-,0) F IY L D

FIFTEEN

(-,0) F IH F T IY N

FIFTY

(-,0) F IH F T IY

FILE

(-,0) F AA IH L

FIND

(-,0) F Y N D

FIRST

(-,0) F ER S (T,0)

FIVE

(-,0) F AA IH V

FOR

(-,0) F (ER,0) (AO,0)

FORMANT

(-,0) F AO (ER,0) M AH N T

FORMANTS

(-,0) F AO (ER,0) M AH N T S

FORTY

(-,0) F AO T IY

FORWARD

(-,0) F AO (ER,0) W ER D

FOUR

(-,0) F AO

FOURTEEN

(-,0) F AO T IY N

FOURTH

(-,0) F AO F

FRAME

(-,0) F ER EH (IH,AX) M

FREQUENCY

(-,0) F ER IY K W EH N S IY

FRICATIVE

(-,0) F ER IH K (IH,0) D IH V

FROM

(-,0) F (ER,0) AH M

FRONT

(-,0) F ER AH N T

GAIN

(-,0) G EH (IH,AX) N

GET

(-,0) G EH T

GET-ME

(-,0) G EH (T,0) M IY

GIVE-ME

(-,0) G IH (V,0) M IY

GLIDE

(-,0) G L AA IH D

GO-INTO

(-,0) G OW (W,0) IH N T UW

GRAPH

(-,0) G ER AE F

GRAPHICS

(-,0) G ER AE F IH K S

GREATER

(-,0) G (S,0) ER EH (IH,AX) D ER

HAND

(-,0) HH AE N D

HEIGHT

(-,0) HH AA IH T

HIGH

(-,0) HH AA IH

HIGHEST

(-,0) HH AA IH S T

HUGHES

(-,0) HH Y UW S

HUNDRED

(-,0) HH AH N D ER IH D

IN

(-,0) IH N

INCREASING

(-,0) IH N K ER IY S IH NX

INCREMENT

(-,0) IH N K ER M EH N T

INFORMATION

(-,0) IH N F ER M EH (IH,AX) SH (AX,0) N

INITIAL

(-,0) IH N IH SH (AX,0) L

JA

(-,0) D SH EH (IH,AX) (-,0) EH (IH,AX)

KILOHERTZ

(-,0) K (S,0) IH L OW HH ER T S



LABELS

(-,0) L EH (IH,AX) B (EH,0) L S

LAST

(-,0) L AE S T

LEAST

(-,0) L IY S T

LEFT

(-,0) L EH F T

LESS

(-,0) L EH S

LEVEL

(-,0) L EH V (EH,0) L

LIQUID

(-,0) L IH K W IH D

LIST

(-,0) L IH S T

LL

(-,0) EH L EH L

LONGER

(-,0) L AO N G ER

LONGEST

(-,0) L AO N G IH S

LOW

(-,0) L OW

LOWEST

(-,0) L OW (W,0) IH S T

MALE

(-,0) M EH (IH,AX) L

MATCHES

(-,0) M AE (-,0) T SH AX S

MATRIX

(-,0) M EH (IH,AX) T ER IH K S

MAXIMUM

(-,0) M AE K S (AX,0) M AH M

MCCANDLESS

(-,0) M IH K AE N D L IH S

MID

(-,0) M IH D

MILLISECONDS

(-,0) M AH L (AH,0) S EH K (AX,0) N (T,0) S

MINIMUM

(-,0) M IH N (AX,0) M AX M

MODE

(-,0) M OW D

MOMENT

(-,0) M OW M EH N T

MOST

(-,0) M OW S T

MOVE

(-,0) M UW V

MY

(-,0) M AA IH

NAMES

(-,0) N EH (IH,AX) M S

NASAL

(-,0) N EH (IH,AX) S (EH,0) L

NEXT

(-,0) N EH K S T

NINE

(-,0) N AA IH N

NINETEEN

(-,0) N AA IH N T IY N

NINETY

(-,0) N AA IH N D IY

NUMBER

(-,0) N AH M B ER

OF

(-,0) (AH,0) V

ON

(-,0) (AA,0)

ONE

(-,0) W AH N

ONLY

(-,0) OW N L IY

ONTO

(-,0) AA IY N T Y (UW,0)

OUTPUT

(-,0) AA UH (T,0) P UH T

OVER

(-,0) OW V ER

PHONEMIC

(-,0) F OW N IY M IH K

PICKOUT

(-,0) P IH K AA UH T

PITCH

(-,0) P IH (-,0) T SH

PLOSIVE

(-,0) P L OW S IH V

PLOT

(-,0) P L AA T

PRIMARY

(-,0) P ER AA IH M (AX,0) (ER,0) IY

PRINT

(-,0) P ER IH N T

PUT

(-,0) P UH T

PUTTING

(-,0) P UH D IH NX

RANGE

(-,0) ER EH (IH,AX) N (-,0) D SH

RECALCULATE

(-,0) ER IY K AE L K (Y,0) (AX,0) L EH (IH,AX) T

RECALCULATED

(-,0) ER IY K AE L K (Y,0) (AX,0) L EH (IH,AX) T AX D

RECOMPUTE

(-,0) ER IY K (AX,0) M P (Y,0) UW T

RECOMPUTED

(-,0) ER IY K (AX,0) M P (Y,0) UW T AX D

REDISPLAY

(-,0) ER IY D (IH,0) S P L EH (IH,AX)

REDUCING

(-,0) ER IY D (Y,0) UW S IH NX

REFRESH

(-,0) ER IY F ER EH SH

RESET

(-,0) ER IY S EH T

RETRIEVE

(-,0) ER IY T (S,0) ER IY V

RIGHT

(-,0) ER AA IH T

RW

(-,0) AH ER D AH B (EH,0) L Y UW

SAVE

(-,0) S EH (IH,AX) V

SCOPE

(-,0) S K OW P

SCREEN

(-,0) S K ER IY N

SDC

(-,0) EH S D IY S IY

SEARCH

(-,0) S ER (-,0) T SH

SECOND

(-,0) S EH K AX N D

SECONDARY

(-,0) S EH K (AX,0) N D EH (ER,0) (IY,0)

SECONDS

(-,0) S EH K AX N S

SEGMENT

(-,0) S EH G M EH N T

SEGMENTATION

(-,0) S EH G M (EH,0) EH (IH,AX) N T EH (IH,AX) SH AX N

SELECT

(-,0) S (AH,0) EH K T

SEMIVOWEL

(-,0) S EH M IY V AA UH L

SENTENCE

(-,0) S EH N (T AX N,0) S

SET

(-,0) S EH T

SETTING

(-,0) S EH D IH NX

SEVEN

(-,0) S EH V EH N


SEVENTEEN

(-,0) S EH V (EH,0) N T IY N

SEVENTY

(-,0) S EH V AX N D IY

SEX

(-,0) S EH K S

SHORTER

(-,0) SH AO (ER,0) T ER

SHORTEST

(-,0) SH AO (ER,0) T AX S T

SHOW-ME

(-,0) SH OW M IY

SITE

(-,0) S AA IH T

SIX

(-,0) S IH K S

SIXTEEN

(-,0) S IH K S T IY N

SIXTY

(-,0) S IH K S D IY

SKIP

(-,0) S K IH P (S,0)

SKIP-OVER

(-,0) S K IH P OW V AH

SLOT

(-,0) S L AA T

SM

(-,0) EH S EH M

SONORANT

(-,0) S OW L N ER (AX,0) N T

SPEAKER

(-,0) S P IY K ER

SPECTRA

(-,0) S P EH (K,0) T ER

SPECTROGRAM

(-,0) S P EH (K,0) T ER G ER AE M

SPECTRUM

(-,0) S P EH (K,0) T ER AH M

SRI

(-,0) EH S AH ER AA IH

STARTING

(-,0) S T AA (ER,0) D IH NX

STOP

(-,0) S T AA P

STORE

(-,0) S T AO




STRESS

(-,0) S T (ER,0) EH S

SWITCH-TO

(-,0) S W IH (-,0) T SH T Y UW

TAG

(-,0) T AE G

TAPE

(-,0) T (S,0) EH (IH,AX) P

TEN

(-,0) T EH N

TERMINAL

(-,0) T ER M (IH,0) N (EH,0) L

TERTIARY

(-,0) T ER SH (AX,0) ER IY

THAN

(-,0) DH AE N

THE

(-,0) DH (AH,0)

THIRD

(-,0) F ER D

THIRTEEN

(-,0) F ER T IY N

THIRTY

(-,0) F ER D IY

THIS

(-,0) DH IH S

THOSE

(-,0) DH OW S

THREE

(-,0) F ER IY

THRESHOLD

(-,0) F ER EH SH (EH,0) L D

TO

(-,0) T Y (UW,0)

TOTAL

(-,0) T OW D (EH,0) L

TRY-TO-FIND

(-,0) T ER AA IH T AH F AA IH N D

TWELVE

(-,0) T W EH L V

TWENTY

(-,0) T W EH N T IY

TWO

(-,0) T Y UW

UNDER

(-,0) AH N D ER

UNIT

(-,0) Y UW N IH T

UNVOICED

(-,0) AH N V AO IH S T

UTTERANCE

(-,0) AH D ER (EH,0) N S

VALUE

(-,0) V (AE,0) L Y UW

VECTOR

(-,0) V EH (K,0) T ER

VOICED

(-,0) V AO IH S T

VOICELESS

(-,0) V AO IH S L IH S

VOWEL

(-,0) V AA UH L

WAVEFORM

(-,0) W EH (IH,AX) V F ER M

WEINSTEIN

(-,0) W AA IH N S T AA IH N

WIDTH

(-,0) W IH D F

WIESEN

(-,0) W IY S AX N

WITH

(-,0) W IH F

WORD

(-,0) W ER D

WRITE

(-,0) ER AA IH T

XEROX

(-,0) S IH ER AA K S

ZEROCROSSING-DENSITY

(-,0) S IH ER OW K ER AA S IH NX D

[ 

- 

] 

C.10. LLEXT: Lincoln Lab's "Extended" Language


C.10.1 LLEXT: Lincoln Lab's "Extended" Syntax


<SENT> ::= [ <SEN> ]

<SEN> ::= <BEGG> <SS>
          <SS>

<BEGG> ::= PLEASE
           NOW
           WELL NOW
           NOW PLEASE

<SS> ::= <DIS>
         <CON>
         <CLR>
         <GO>
         <DEL>
         <SK>
         <MOV>
         <COMP>
         <GET>
         <PIC>
         <WRITE>
         <PUT>
         <LIST>
         <SET>
         <MODSEG>
         <MODDIS>
         <CHNG>
         <QUEST>


<DIS> 

<DISPV> 


<DISOBJ>

<DISOBJ1>


<DISPV> <DISOBJ>

<DISPV> <DISOBJ> <DISWH>
DISPLAY 

REDISPLAY

SHOW-ME 

PUT-UP 

EDIT 

I-WANT-TO-SEE

LETS-SEE

THE <DISOBJ1>

ALL  MATCHES 
ALL MATCHES <DW>
<DISCLS> 

<DISCLS> <DUDW>

<LABS> LABELS
<LABS> LABELS <DU>


<DUDW> 

FORMANTS 
FORMANTS  <DU> 
FORMANT  <PAR2> 
FORMANT  <PAR2>  <DU> 
<DU> 

<DW> 

<DW>

OF  <DET>  <UTT> 

<DET> 

THE 

<DU> 

THIS 

FOR <DISWRD>

<DISCLS>

<PAR> 

<MEAS> 

<MEAS>  <PAR> 
<LABS>  LABELS 
<DATFOR> 

<PLOT>  OF  THE  <PAR> 
AVERAGE 

<PAR> 

MAXIMUM 

MINIMUM 

TOTAL 

MAXIMA 

MINIMA 

DISTRIBUTION-OF 

RANGE-OF

<PAR2> 

<PAR2> 

FIRST  MOMENT 
SPECTRAL  SLICES 
AMPLITUDE 

<LABS> 

PITCH 

FREQUENCY 

ENERGY 

DURATION 

EDITED 

<DATFOR> 

PHONEMIC 

HAND 

<DATFOR1>

<DATFOR3>

<DATFOR2> 
<DATFOR3> 
ENVELOPE  <S> 

<DATFOR2>

SPECTROGRAM  <S> 

<SPEC>  SPECTROGRAM  <S> 
WAVEFORM  <S> 

SPECTRUM 

SPECTRA 

SEGMENTATION 

CONFUSION-MATRIX 

<DATFOR1>

EVENT-ARRAY  <S> 
PARSE-TREE  <S> 
MEAN-VALUES 
FORMANTS 
ENTRY-INFORMATION 



<SPEC> 

<DISWRD>

<DISMOD>
<PHONS> ::=


<VOIC> 

<POS> 

<FRIC> ::=
<STOP> ::=

<VOW>
<NAS> ::=

<CONS>


ZEROCROSSING DENSITY
PHONEMIC-TRANSCRIPTION <S>

LEXICAL-TRANSCRIPTION <S>

ZEROCROSSINGS

u.^!N  TABLE  <S> 

HOMOMORPHIC
PREDICTIVE-CODING
THE <PHONS>

THE <DISMOD> <PHONS>

THE  <DISMOD>  WORD 

THOSE  <PHONS>  <SPAN>  <CARD>  <TIME> 

<PHONS>  AND  <PHONS> 

<LEN> 

<ORD>

<VOW>

<POS> <VOW>

<STOP>

<VOIC> <STOP>

<NAS>

<FRIC>

<VOIC> <FRIC>

SONORANT  <S> 

CONSONANT  <S> 

<CONS>  CONSONANT  <S> 

DIPHTHONG  <S> 

SILENCE

TRANSITION 

VOICING 

VOICED 

UNVOICED 

VOICELESS 

FRONT

BACK 

HIGH 

LOW 

MID 

FRICATIVE  <S> 

AFFRICATE  <S> 

STOP  <S> 

PLOSIVE  <S> 

BURST  <S> 

ASPIRATE <S>

VOWEL  <S> 

SEMIVOWEL  <S> 

NASAL  <S> 

LIQUID  <S> 

GLIDE  <S> 

GLOTTAL 

INTERVOCALIC

LABIAL 


<LEN> 

<ORD> 


<DISWH> 

<SCO> 


<SCTYPE> 

<DISDEV> 


<UTT> 


LONGEST 

SHORTEST 

FIRST 

SECOND 

THIRD 

FOURTH 

FIFTH 

SIXTH 

SEVENTH 

EIGHTH 

NINTH 

TENTH 

ON  THE  <SCO> 
<DISDEV>

<SCTYPE>  <DISDEV> 

SCAN-CONVERTOR 

HUGHES 

REFRESH

SCOPE

DISPLAY

SCREEN

ENTRY <S>

UTTERANCE <S>

SENTENCE  <S> 

SLOT  <S> 

FILE  <S> 

DATA 

STATEMENT  <S> 
SAMPLE  <S> 
EXAMPLE  <S> 




<CON> ::=
<CONV> 


<COND> 


<DETM> 


<TERM> 

<DIGIT> 


<CONV>  <UNIT>  <DIGIT> 

<CONV>  <UNIT>  <DIGIT>  <COND> 

CONNECT 

ASSIGN 

LOAD 

REWIND 

TO  <DETM>  <TERM> 

TO  <TERM>  <DIGIT> 

TO <TERM> NUMBER <DIGIT>

THE 

THIS 

MY 

CONSOLE 

TERMINAL 

ONE 

TWO 

THREE 

FOUR 




<UNIT> 


<CLR> 

<CLRV> 


<CLRO> 

<PLOT> 

<GO>

<GOV> 

<MODE> 

<DEL> 

<DELV> 


<DELO> 


<DELOD> 


FIVE 

SIX 

SEVEN 

EIGHT 

NINE 

TAPE-UNIT 

UNIT 


<CLRV>  THE  <CLRO> 

CLEAR 

ERASE 

CLEAN 

INITIALIZE 

REINITIALIZE 

REDO 

REFRESH 

<SCO> 

<PLOT>  OF  THE  <PAR> 
<DATFOR> 

<PLOT>  OF  THE  <DATFOR> 
PLOT  <S> 

GRAPH  <S> 

FUNCTIONS 
LINE  <S> 


<GOV>  THE  <MODE>  MODE 

GO-INTO 

SWITCH-TO 

SET-TO 

SEARCH 

GRAPHICS 

DISPLAY

INPUT 


<DELV> <DELO>

DROP 

DELETE 

FORGET 

REMOVE 

SCRATCH 

THROW-AWAY 

<QUANT> <DATFOR>

<QUANT> <DATFOR> <DELOD>
<QUANT> <LABS> LABELS

<QUANT> <LABS> LABELS <DELOD>

THOSE <SPAN> <CARD> <TIME>
FROM  THE  <DATDEV> 


<QUANT> ::= <QUANT1>
            ALL
            ALL THE

<QUANT1> ::= THE
             THIS

<SPAN> ::= <SPAN1>
           LONGER THAN
           GREATER THAN
           SHORTER THAN
           LESS THAN
           BETWEEN <CARD> AND
           LOWER THAN
           HIGHER THAN

<SPAN1> ::= OVER
            UNDER
            ABOVE
            BELOW

<CARD> ::= <DIGIT>
           <DIGIT> <HUNDREDS>
           <TENS>
           <TENS> <DIGIT>
           <TEENS>

<TENS> ::= TWENTY
           THIRTY
           FORTY
           FIFTY
           SIXTY
           SEVENTY
           EIGHTY
           NINETY

<TEENS> ::= TEN
            ELEVEN
            TWELVE
            THIRTEEN
            FOURTEEN
            FIFTEEN
            SIXTEEN
            SEVENTEEN
            EIGHTEEN
            NINETEEN

<HUNDREDS> ::= HUNDRED
               HUNDRED <DIGIT>
               HUNDRED <TEENS>
               HUNDRED <TENS>
               HUNDRED <TENS> <DIGIT>

<TIME> ::= SECONDS
           MILLISECONDS
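The <CARD> rules above cover spoken numbers from ONE up to NINE HUNDRED NINETY NINE. As a hedged illustration of how a recognised <CARD> phrase might be turned into an integer, the Python sketch below follows those rules directly. The function names `below_hundred` and `card_value` are hypothetical, and returning `None` for phrases outside the rules is a design choice of this sketch rather than something the thesis specifies.

```python
DIGIT = {w: i for i, w in enumerate(
    "ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE".split(), start=1)}
TEENS = {w: i for i, w in enumerate(
    "TEN ELEVEN TWELVE THIRTEEN FOURTEEN FIFTEEN SIXTEEN "
    "SEVENTEEN EIGHTEEN NINETEEN".split(), start=10)}
TENS = {w: 10 * i for i, w in enumerate(
    "TWENTY THIRTY FORTY FIFTY SIXTY SEVENTY EIGHTY NINETY".split(), start=2)}


def below_hundred(words):
    """Value of <DIGIT>, <TEENS>, <TENS> or <TENS> <DIGIT>, else None."""
    if len(words) == 1:
        for table in (DIGIT, TEENS, TENS):
            if words[0] in table:
                return table[words[0]]
        return None
    if len(words) == 2 and words[0] in TENS and words[1] in DIGIT:
        return TENS[words[0]] + DIGIT[words[1]]
    return None


def card_value(phrase):
    """Integer value of a <CARD> phrase, or None if the rules reject it."""
    words = phrase.split()
    # <DIGIT> <HUNDREDS>: a digit, the word HUNDRED, then an optional tail.
    if len(words) >= 2 and words[0] in DIGIT and words[1] == "HUNDRED":
        tail = words[2:]
        if not tail:
            return 100 * DIGIT[words[0]]
        rest = below_hundred(tail)
        return None if rest is None else 100 * DIGIT[words[0]] + rest
    return below_hundred(words)


print(card_value("THREE HUNDRED FORTY TWO"))
print(card_value("SEVENTEEN"))
```

A bare HUNDRED is rejected because <CARD> only reaches <HUNDREDS> after a <DIGIT>.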

<SK>  :: 


<SKV>  <SKVO> 


<SKV> 


<SKV1> 


<SKVO> ::=


<SKVT> 


<SEQ> 


SKIP 

SKIP <DIR> TO
SKIP-OVER
SKIP-OVER TO
MOVE TO
MOVE <DIR> TO
CONTINUE TO
CONTINUE <DIR> TO
SEARCH  FOR 
SEARCH  <DIR>  FOR 
READ 
READ  TO 
READ  <DIR> 

READ  <DIR>  TO 
GO  TO 

GO  <DIR>  TO 
PROCEED  TO 
PROCEED <DIR> TO
<SKV1> 

SKIP-TO 

FIND 

RETRIEVE 

SHIFT-TO 

GET 

GET-ME 

GIVE-ME 

TRY-TO-FIND

PICKOUT 

SELECT 

GO  ON-TO 

I-WANT-TO-SEE

I-WANT-ONLY

I-ONLY-WANT

LET-ME-SEE 

LETS-SEE 

PARSE 

READ-IN 

RETURN-TO 

<SEQ>  <UTT> 

<SEQ>  <UTT>  <SKVT> 

<UTT>  <CARD> 

<UTT> <CARD> <SPKR>

ON UNIT <DIGIT>

ON  TAPE  UNIT  <DIGIT> 
FROM  THE  <DATDEV> 
<WITH>  <DETA>  <PHONSEG> 
THE  <SEQ1> 

THE  <ORD> 

ANOTHER 

THE 


<SEQ1> 


<SPKR> 


<DETA>

<BY> 


NEXT 

CURRENT 

INITIAL 

LAST

FINAL 

BEGINNING 

ENDING 

BRIEF 

OTHER 

PREVIOUS 

PROBLEM 

<BY>  <NAME> 

<BY>  <INIT> 

<BY>  A <SEX>  SPEAKER 

<BY> SPEAKER NUMBER <DIGIT>

A 

THE 

BY 

SPOKEN-BY 


<MOV> 

<MOVE> 

<MOVO> 


<MOVO2>


<MOVO3>

<SIDE> 


<MARK> 


<MOVE> <MOVO>

MOVE 

SHIFT 

THE <MOVO2>

UNIT <CARD> <MOVO3>

TAPE UNIT <CARD> <MOVO3>

<MARK>  TO  <SEQ>  <SEG> 

<MARK> TO <SEQ> <VOIC> <SEG>

<SIDE> <MARK> TO <SEQ> <SEG>

<SIDE> <MARK> TO <SEQ> <VOIC> <SEG>
<MARK>  TO  <SEQ>  <PHONS>  <SEG> 

<SIDE>  <MARK>  TO  <SEQ>  <PHONS>  <SEG> 
<MARK>  TO  THE  <PHONS>  <SEG> 

<SIDE>  <MARK>  TO  THE  <PHONS>  <SEG> 
<MARK>  <DIR>  <CARD>  <TIME> 

<SIDE> <MARK> <DIR> <CARD> <TIME>
TAPE  <DIR>  <CARD>  <UTT> 

TAPE  TO  <SEQ>  <UTT> 

TO THE <ORD> <UTT>

<DIR>  <CARD>  <UTT> 

RIGHT 

LEFT 

PREVIOUS 

NEXT 

BOUNDARY 
CURSOR 
MARKER  <S> 

LABEL  <S> 

DESCRIPTOR 


<SEG> 


<DIR> 


<COMP> 

<COMPV> 


<COMPO> 

<COMP2>

<COMP3>

<COMP4> ::=
<SHAPE> 

<LEV> 


POINTER 
POINT 
FRAME <S>
SEGMENT  <S> 
EXAMPLE  <S> 
EVENT  <S> 
OCCURANCE  <S> 
PHONEME  <S> 
SECTION  <S> 
PHRASE  <S> 
FORWARD 
BACKWARD 

EARLIER

LATER 

ALONG 

AHEAD 

BACK 


<COMPV> THE <COMPO>

COMPUTE 

CALCULATE 

RECOMPUTE 

RECALCULATE 

DO 

REDO 

AVERAGE 

NORMALIZE 

<MEAS> <PAR>

<MEAS>  <PAR>  <COMP2> 

<DATFOR2>  <COMPA> 

DISTRIBUTION  OF  THE  <PHONS> 

EFFECT OF <SHAPE> THE <LEV> TO <CARD>
IN  <COMP3> 

IN  <SEQ>  <COMP3> 

<PHONS> 

<SEG> 

<VOIC>  <SEG> 

FOR  <DET>  <UTT> 

<COMP2> 

PUTTING 

SETTING

INCREASING 

REDUCING 

THRESHOLD 

LEVEL 

GAIN 

CUTOFF 


<GET> 

<GETV> 

<GETV1> 


<GETO> 


<GETO1>


<GETO2> ::=


<PREPIN> 

<INSEN> 


<GET2> 


<GETV>  <GETO> 

<GETV1> 

SEARCH  FOR 

FIND 

GET 

RETRIEVE 

GET-ME 

GIVE-ME 

TRY-TO-FIND

PICKOUT 

SELECT 

I-WANT-TO-SEE 
I-ONLY-WANT
I-WANT-ONLY
LET-ME-SEE 
LETS-SEE 

<QUANTG1> <GETO1>

<GETO2>

<SEQ> <SEG>

<UTT>  INFORMATION 
<DISCLS>  <WITH>  A <PHONSEG> 
<DISCLS>  FOR  <GET2> 

<DISCLS>  FROM  THE  <DATDEV> 

THE  RANGE  OF  THE  <ORD>  FORMANT 
THE  RANGE  OF  THE  <PAR> 

THE  <UTT>  <START>  WITH  <COM> 

THE  <TENS>  KILOHERTZ  WAVEFORM  <S> 
<SEQ>  <UTT>  <PREPIN>  <DATDEV> 

THE  <PHONS> 

THE <PHONS> ONLY <INSEN>

<SEQ>  <PHONS> 

<SEQ>  <PHONS>  <INSEN> 

<SEQ>  <SEG> 

<SEQ> <SEG> <INSEN>

<SEQ>  <PHONS>  <SEG> 

<SEQ>  <PHONS>  <SEG>  <INSEN> 

THE  <SEG> 

THE  <SEG>  <INSEN> 

THE  <PHONS>  <SEG> 

THE  <PHONS>  <SEG>  <INSEN> 

THE  <PHONS>  FOR  <GET2> 

ON 

IN-THE 
IN  <GET2> 

FROM  <GET2> 

<UTT>  <CARD> 

<UTT> 

<SEQ>  <UTT> 

THE  <UTT> 

THE  <SEQ>  <UTT> 




<DATDEV> 

<QUANT2>  <UTT>  <SPKR> 
<UTT> <LISTG> <CARD>
<DATDEV1>

<DATDEV1>

<DATDEV2>

TAPE 

<DATDEV2> 

DRUM 

DISK 

DATA-BASE 

<START> 

COMPUTER 

BEGINNING 

<COM> 

STARTING 

RETRIEVE 

<QUANTG1>

DELETE 

DISPLAY 

REDISPLAY 

ALL 

<QUANT2> 

ALL  THE 

THE 

EACH 

<LISTG> 

EVERY 

NUMBER 

<PIC> 

LIST 

<PICKV>  <QUANT>  <PHONS>  <PICO> 

<PICKV> 

PICKOUT 

<PICO> 

SELECT 
FIND 
LOCATE 
SHOW-ME 
IN <SEQ> <UTT>

<LIM> 

WITH THE <LIM> ENERGY
ONLY FROM <UTT> LIST <CARD>
WITH <ORDER> STRESS
LEAST 

<ORDER> 

MOST 

HIGHEST 

LOWEST 

PRIMARY 

<WRITE> 

SECONDARY 

TERTIARY 

<WRITV>  <WRITO> 

<WRITV>

<WRITV> <WRITO> <WRITD>
WRITE 

STORE 

SAVE 

PUT 

KEEP 



<WRITO> 

<WRITD>

<WRIT2> 

<COMPA> 

<CHOICE>

<THIS> 

<PUT> 

<PUTV> 


<PUTO> 

<PUTWHERE> 

<LIST>

<LISTV>


INSERT 

ADD 

<QUANT>  <WRIT2> 

EVERYTHING 
<SEQ>  <UTT> 

ONTO  <DATDEV1> 

ONTO  THE  <DATDEV1> 

IN  THE  <DATDEV2> 

ON THE <DISDEV>

INTO THE <DATDEV2>

ON <DATDEV1>

ON  THE  <DATDEV1> 

<DISCLS>

FORMANT  <PAR> 

<COMPA>  VALUE  <S> 

<CHOICE> FROM <THIS> ANALYSIS
<COMPA>  FIELD  <S> 

<TENS>  KILOHERTZ  WAVEFORM 

COMPUTED 

RECOMPUTED 

CALCULATED 

RECALCULATED 

NORMALIZED 

CHOICE 

RESULT 

PROBABILITIES 

PERCENTAGES 

THIS 

THESE 


<PUTV>  <PUTO> 

PUT 

INSERT 

ADD 

POSITION 

MOVE 

SHIFT 

SLIDE 

<QUANT> <DISCLS> <PUTWHERE>

THE  <SIDE>  <MARK>  ON  THE  <ORD>  <SEG> 
<SPAN1> THE <DISCLS>

HERE 

THERE 


<LISTV>  <LISTWHT> 

<LISTV>  <LISTWHT>  <LISTO> 

LIST 

PRINT 







TYPE 

OUTPUT 

<LISTWHT>  <QUANT>  <PHONS>

<QUANT>  <DATFOR1>

THE  VECTOR  OF  <UTT>  NAMES 
THE  <MEAS>  ENERGY  IN  THE  BAND 
<LISTO>  ON  THE  <PRDEV> 

FROM  <UTT>  <CARD> 

<PRDEV>  XEROX 

SCOPE 

<WITH>  WITH 

CONTAINING 

PRECEEDING

FOLLOWING 

FOLLOWED-BY 

STARTING-WITH 

ENDING-WITH 

PRECEEDED-BY 

<PHONSEG>  <PHONS>  <SEG2> 

<PHONS>  <PHONS>  <SEG2> 

<PHONS>  <PHONS>  <PHONS>  <SEG2> 
<SEG2>  <SEG> 

STRING 

SEQUENCE 

COMBINATION 


<MODDIS> 

<MODV> 


<MODO> 


<MODV>  <MODO> 

BOOST 

DECREASE 

DOUBLE 

ENLARGE 

INCREASE 

REDUCE 

SPREAD-OUT 

THE  <DISCLS>

FOR  THE  <DISWRD> 


<MODSEG>

<MODSV> 


<MODSO>

<ADV> 


<MODSV>  <MODSO>

ABSORB 

ADD 

INSERT 

POSITION

TAKE 

<DET>  <PHONSEG>  <ADV>  <DET>  <PHONS>

AFTER 

BEFORE 

PRECEEDING 

FOLLOWING 


AT 




<SET> 

<SETV> 

<SETO>

<ID> 

<INIT>

<NAME> 

<DIM>

<SEX> 

<SITE> 

<TAG> 

<CHNG> 

<CHNGV> 

<PHSG> 

<QUEST> 


<SETV>  THE  <SETO> 

SET 

RESET 

BATCH  <TAG>  TO  <CARD> 
DEFAULT  SPEAKER  TO  <ID>
DEFAULT  FOR  SEX  TO  <SEX>
DEFAULT  FOR  SITE  TO  <SITE>
COLUMN  <DIM>  TO  <CARD>
INCREMENT  TO  <CARD>

<INIT>

<NAME> 

JA 

RW 

SM 

CW 

ALLEN 

WIESEN 

MCCANDLESS 

WEINSTEIN 

WIDTH 

HEIGHT 

MALE 

FEMALE 

LL 

BBN 

SRI 

SDC 

CMU 

CODE 

TAG 


<CHNGV>  <DET>  <PHONSEG>  A <PHONS> 
CHANGE  THE  <PHONSEG>  TO  A <PHONS>
ASSIGN  <PHONS>  TO  THE  <PHONSEG> 
COMPARE  THE  <PHSG>  WITH  THE  <PHSG> 
NAME 

DESIGNATE 

LABEL 

MARK 

CALL 

MAKE 

<PHONSEG> 

<PHONS> 


WHO  OWNS  <UTTOWN> 




<QUESTV>

<AXIL>

<DEVWHR>
<UTTOWN>  ::=

<EXIST>

<WHATS>

<S>  ::=


WHERE  <AXIL>  <EXIST>

<QUESTV>  <UTT>  HAVE  <DATFOR>  <DEVWHR>
WHAT  IS  THE  <WHATS> 

HOW  MANY 
WHAT 
WHICH 
IS 

ARE 

WAS

ON  <DATDEV1>

IN  <DATDEV2>

<UTT>  <CARD>

<UTT>  <BY>  <SPKR>

<SEQI>  <UTT>

<DATFOR>  FOR  THE  <UTTOWN> 

<UTTOWN> 

<PAR2> 

<DATFOR2> 

<DATFOR1>

<LAB5>  LABELS
OWNER'S-NAME

S 
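
A grammar fragment of this kind can be explored mechanically. The sketch below is illustrative only: it transcribes two rules whose left and right sides appear together in the listing above (`<LISTO>` and `<PRDEV>`), and the grouping of alternatives under each left-hand side is an assumption wherever the scan lost the column alignment. The names `GRAMMAR`, `expand`, and `sentences` are hypothetical and not part of the original system. Nonterminals without a rule in the fragment (here `<UTT>` and `<CARD>`) are left unexpanded, and the sketch assumes the fragment is not recursive.

```python
from itertools import product

# Fragment of the grammar; alternatives grouped per left-hand side.
# The grouping is an assumption where the scanned listing is ambiguous.
GRAMMAR = {
    "<LISTO>": [["ON", "THE", "<PRDEV>"], ["FROM", "<UTT>", "<CARD>"]],
    "<PRDEV>": [["XEROX"], ["SCOPE"]],
}


def expand(symbol, grammar=GRAMMAR):
    """Return every token sequence derivable from `symbol`.

    Symbols that have no rule in `grammar` are kept verbatim, so
    terminals and undefined nonterminals pass through unchanged.
    Only valid for non-recursive fragments.
    """
    if symbol not in grammar:
        return [[symbol]]
    results = []
    for alternative in grammar[symbol]:
        # Expand each token of the alternative, then combine the choices.
        parts = [expand(token, grammar) for token in alternative]
        for combo in product(*parts):
            results.append([tok for seq in combo for tok in seq])
    return results


def sentences(symbol, grammar=GRAMMAR):
    """Join each derivation of `symbol` into a space-separated string."""
    return [" ".join(seq) for seq in expand(symbol, grammar)]


if __name__ == "__main__":
    for s in sentences("<LISTO>"):
        print(s)
```

Counting the strings returned for each nonterminal gives a rough measure of how much a rule restricts the words that may follow, which is the kind of constraint the thesis analyses.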




C.10.2  LLEXT:  Lincoln  Lab's  "Extended"  Dictionary


A 

(-.0)  AE 

ABOUT 

(-,0)  (AH  ,0)  B AA  UH  T

ABOVE 

(-.0)  (AH  ,0)  B AH  V 

ABSORB 

(-,0)  (AH  ,0)  B S OW  (ER  ,0)  B

ADD 

(-.0)  AE  D 

AFFRICATE 

(-.0)  AE  F ER  (IH  ,0)  K EH  T 

AFTER 

(-.0)  AE  F D ER 

AHEAD 

(-.0)  (AH  ,0)  HH  EH  D 

ALL 

(-,0)  AO  (L  ,0)

ALLEN 

(-.0)  AE  L (EH  ,0)  N 

ALONG 

(-.0)  (AH  ,0)  AO  NX 

AMPLITUDE 

(-,0)  AE  M P L (IH  ,0)  T UW  D

ANALYSIS 

(-.0)  AE  N AE  L IH  S AX  S 

AND 

(-.0)  AE  N 

ANOTHER 

(-.0)  (AH  ,0)  N AH  DH  ER 

ARE 

(-.0)  AO  ER 

ASPIRATE

(-.0)  AE  S P ER  (AX  ,0)  T 

ASSIGN 

(-,0)  AH  S AA  IH  N

AT 

(-.0)  AE  T 

AVERAGE 

(-,0)  AE  V ER  IH  (-,0)  D SH

BACK 

(-.0)  B AE  K 

BACKWARD 

(-,0)  B AE  K W ER  D

BAND

(-,0)  B AE  N D

BATCH 

(-.0)  B AE  (-.0)  T SH 

BBN 

(-.0)  B lY  B lY  EH  N 

BEFORE 

(-.0)  B IH  F OW  (ER  ,0) 

BEGINNING 

(-.0)  B IH  G IH  N IH  NX 

BELOW 

(-.0)  B IH  L OW 

BETWEEN 

(-.0)  B IH  T W lY  N 

BOOST 

(-.0)  B UW  S T 

BOUNDARY 

(-.0)  B AA  UH  N D (AX  ,0)  ER  lY 

BRIEF 

(-.0)  B ER  lY  F 

BURST 

(-.0)  B ER  S T 

BY 

(-.0)  B AA  IH 

CALCULATE 

(-.0)  K AE  L K (Y  ,0)  (AX  ,0)  L EH  (IH,AX)  T 

CALCULATED 

(-.0)  K AE  L K (Y  ,0)  (AX  ,0)  L EH  (IH,AX)  T AX  D 

CALL 

(-.0)  K AO  L 

CHANGE 

(-,0)  T SH  EH  (IH,AX)  N (-,0)  D SH

CHOICE 

(-,0)  T SH  AO  IH  S 

CLEAN 

(-.0)  K L lY  N 

CLEAR 

(-.0)  K L IH  ER 

CMU 

(-,0)  S IY  EH  M Y UW

CODE 

(-.0)  K OW  D 

COLUMN 

(-.0)  K AA  L (AH  ,0)  M 

COMBINATION 

(-.0)  K AO  M B (IH  ,0)  N EH  (IH,AX)  SH  AX  N 

COMPARE 

(-,0)  K AA  L M P EH  (ER  ,0)

COMPUTE

(-,0)  K (AH  ,0)  M P (Y  ,0)  UW  T

COMPUTED 

(-.0)  K AX  M P (Y  ,0)  UW  T AX  D 

COMPUTER 

(-.0)  K AX  M P (Y  ,0)  UW  T ER 

CONFUSION-MATRIX 

(-.0)  K AX  N F Y UW  SH  AX  N EH  (IH.AX)  T ER  IH  K S 

CONNECT 

(-,0)  K (AH  ,0)  N EH  K T

CONSOLE 

(-.0)  K AA  N j (EH,0)  L 

CONSONANT 

(-,0)  K AA  N S (AH  ,0)  N AH  N T

CONTAINING 

(-,0)  K AX  N T EH  (IH.AX)  N IH  NX 

CONTINUE 

(-,0)  K AX  N T IH  N Y UH 

CPS 

(-.0)  S lY  P lY  EH  S 

CURRENT 

(-.0)  K AH  ER  (AX  ,0)  N T 

CURSOR 

(-,0)  K ER  S ER

CUTOFF 

(-,0)  K AH  D AO  F 

CW 

(-,0)  S IY  D AH  B (EH,0)  L Y UW

CYCLES-PER-SECOND

(-,0)  S AA  IH  K (EH,0)  L S P ER  S EH  K AX  N D

DATA

(-,0)  D EH  (IH,AX)  D AH

DATA-BASE

(-,0)  D EH  (IH,AX)  D AH  B EH  (IH,AX)  S

DECREASE

(-,0)  D (IY  ,0)  K ER  IY  S

DEFAULT

(-,0)  D IH  F AO  (L  ,0)  T

DELETE

(-,0)  D (AH  ,0)  L IY  T

DESCRIPTOR 

(-,0)  D (IH  ,0)  S K ER  IH  P T ER 

DESIGNATE

(-,0)  D EH  S IH  G N (IH,AX)  T

DIPHTHONG

(-,0)  D IH  F F AA  NX

DISK

(-,0)  D IH  S K

DISPLAY

(-,0)  D (IH  ,0)  S P L EH  (IH,AX)

DISTRIBUTION 

(-.0)  D IH  S T (ER  ,0)  B Y UW  SH  (AX  ,0)  N 

DISTRIBUTION-OF

(-,0)  D IH  S T (ER  ,0)  B Y UW  SH  (AX  ,0)  N (AH  ,0)  V

DO

(-,0)  D UW

DOUBLE

(-,0)  D AH  B (EH,0)  L

DROP

(-,0)  D (ER  ,0)  AA  P

DRUM

(-,0)  D (ER  ,0)  AH  M

DURATION

(-,0)  D ER  EH  (IH,AX)  SH  (AH  ,0)  N

EACH

(-,0)  IY  (-,0)  T SH

EARLIER

(-,0)  ER  L IY  ER

EDIT

(-,0)  EH  D IH  T

EDITED

(-,0)  EH  D (AX  ,0)  D EH  D

EFFECT 

(-.0)  lY  F EH  K T 

EIGHT 

(-,0)  EH  (IH,AX)  T

EIGHTEEN

(-,0)  EH  (IH,AX)  T IY  N

EIGHTH

(-,0)  EH  (IH,AX)  F

EIGHTY

(-,0)  EH  (IH,AX)  D IY

ELEVEN

(-.0)  lY  L EH  V AX  N 

END 

(-.0)  EH  N D 

ENDING 

(-,0)  EH  N D IH  NX

ENDING-WITH

(-,0)  EH  N D IH  NX  W IH  F 

ENERGY 

(-.0)  EH  N ER  (-.0)  D SH  lY 

ENLARGE 

(-,0)  EH  N L AO  (ER  ,0)  (-,0)  D SH

ENTRY-INFORMATION

(-,0)  EH  N (T  ,0)  ER  IY  IH  N F ER  M EH  (IH,AX)  SH  (AX  ,0)  N

ENTRY

(-,0)  EH  N (T  ,0)  ER  IY

ENVELOPE 

(-.0)  AH  N V (AX  ,0)  L OW  P 

ERASE 

(-,0)  lY  ER  EH  (IH.AX)  S 

EVENT-ARRAY 

(-.0)  lY  V EH  N T (AH  ,0)  ER  EH  (IH,AX) 

EVENT 

(-.0)  lY  V EH  N T 

EVERY 

(-.0)  EH  V ER  lY 

EVERYTHING 

(-.0)  EHV  ER  lYF  IH  NX 

EXAMPLE 

(-.0)  EH  G S AE  M P (EH,0)  L 

FEMALE 

(-.0)  F lY  M EH  (IH.AX)  L 

FIELD 

(-,0)  F IY  L D

FIFTEEN

(-,0)  F IH  F T IY  N

FIFTH 

(-.0)  F IH  F F 

FIFTY 

(-.0)  F IH  F T lY 

FILE 

(-,0)  F AA  IH  L

FINAL 

(-.0)  F AA  IH  N (EH,0)  L 

FIND 

(-.0)  F Y N D 

FIRST 

(-.0)  F ER  S (T  ,0) 

FIT 

(-.0)  F IH  T 

FIVE 

(-.0)  F AA  IH  V 

FOLLOWED-BY 

(-.0)  F AO  L OW  B AA  IH 

FOLLOWING 

(-.0)  F AO  L OW  (W  ,0)  (IH  ,0)  NX 

FOR 

(-,0)  F (ER  ,0)  (AO  ,0)

FORGET 

(-.0)  F ER  G EH  T 

FORMANT 

(-,0)  F AO  (ER  ,0)  M AH  N T

FORMANTS 

(-.0)  F AO  (ER  ,0)  M AH  N T S 

FORTY 

(-.0)  F AO  T lY 

FORWARD 

(-,0)  F AO  (ER  ,0)  W ER  D

FOUR 

(-.0)  F AO 

FOURTEEN 

(-.0)  F AO  T lY  N 

FOURTH 

(-.0)  F AO  F 

FRAME 

(-.0)  F ER  EH  (IH.AX)  M 

FREQUENCIES 

(-.0)  F ER  lY  K W EH  N S lY  S 

FREQUENCY 

(-.0)  F ER  lY  K W EH  N S lY 

FRICATIVE 

(-.0)  F ER  IH  K (IH  ,0)  D IH  V 

FROM 

(-.0)  F (ER  ,0)  AH  M 

FRONT 

(-.0)  F ER  AH  N T 

FUNCTIONS 

(-.0)  F AH  N K SH  (AH  ,0)  N 

GAIN 

(-.0)  G EH(IH,AX)  N 

GAIN-TABLE 

(-.0)  G EH  (IH.AX)  N T EH  (IH.AX)  B (EH,0)  L 

GET 

(-.0)  G EH  T 

GET-ME 

(-,0)  G EH  (T  ,0)  M IY

GIVE-ME 

(-.0)  G IH  (V  ,0)  M lY 

GLIDE 

(-.0)  G L AA  IH  D 

GLOTTAL 

(-.0)  G L AA  D(EH,0)  L 

GO 

(-.0)  G OW 

GO-INTO 

(-,0)  G OW  (W  ,0)  IH  N T UW

GO-ONTO

(-,0)  G OW  AA  N T UW 

GRAPH 

(-.0)  G ER  AE  F 

GRAPHICS 

(-,0)  G ER  AE  F IH  K S 

GREATER 

(-,0)  G (S  ,0)  ER  EH  (IH,AX)  D ER

HALF

(-,0)  HH  AE

HAND

(-,0)  HH  AE  N D

HAVE

(-.0)  HH  AE  V 

HEADER 

(-.0)  HH  EH  D ER 

HEIGHT 

(-,0)  HH  AA  IH  T

HERE 

(-.0)  HH  lY  (ER  ,0) 

HIGH 

(-.0)  HH  AA  IH 

HIGHER 

(-.0)  HH  AA  IH  ER 

HIGHEST 

(-,0)  HH  AA  IH  S T

HOMOMORPHIC

(-,0)  HH  OW  M (OW  ,0)  M OW  (ER  ,0)  F IH  K

HOW

(-,0)  HH  AA  UH

HUGHES 

(-.0)  HH  Y UW  S 

HUNDRED 

(-.0)  HH  AH  N D ER  IH  D 

I-ONLY-WANT

(-,0)  AA  IH  OW  N L IY  W AA  N T

I-WANT-ONLY

(-,0)  AA  IH  W AA  N T OW  N L lY 

I-WANT-TO-SEE 

(-.0)  AA  IH  W AA  N T UW  S lY 

IN 

(-.0)  IH  N 

IN-THE 

(-.0)  IH  N DH  (AH  ,0) 

INCREASE 

(-,0)  IH  N K ER  IY  S

INCREASING 

(-.0)  IH  N K ER  lY  S IH  NX 

INCREMENT 

(-,0)  IH  N K ER  M EH  N T

INFORMATION 

(-,0)  IH  N F ER  M EH  (IH.AX)  SH  (AX  ,0)  N 

INITIAL 

(-,0)  IH  N IH  SH  (AX  ,0)  L

INITIALIZE

(-,0)  IH  N IH  SH  (AX  ,0)  L AA  IH  S

INPUT 

(-.0)  IH  N P UW  T 

INSERT 

(-.0)  IH  N S ER  T 

INTERVOCALIC 

(-.0)  lY  N T ER  V OW  K AE  L IH  K 

INTO 

(-.0)  IH  N T UW 

IS 

(-,0)  lY  S 

JA 

(-.0)  D SH  EH  (IH.AX)  (D  ,0)  EH  (IH,AX) 

KEEP 

(-.0)  K lY  P 

KILOHERTZ 

(-.0)  K (S  ,0)  IH  L OW  HH  ER  T S 

LABEL

(-,0)  L EH  (IH,AX)  B (EH,0)  L

LABELS

(-,0)  L EH  (IH,AX)  B (EH,0)  L S

LABIAL

(-.0)  L EH  (IH.AX)  B lY  (EH,0)  L 

LAST 

(-.0)  L AE  S T 

LATER 

(-.0)  L EH(IH,AX)  D ER 

LEAST 

(-.0)  L lY  S T 

LEFT 

(-.0)  L EH  F T 

LESS 

(-.0)  L EH  S 

LET-ME-SEE 

(-.0)  L EH  (IH.AX)  T M lY  S lY 

LETS-SEE 

(-.0)  L EH  (IH.AX)  S lY 

LEVEL 

(-,0)  L EH  V (EH,0)  L

LEXICAL-TRANSCRIPTION

(-,0)  L EH  K S (IH  ,0)  K (EH,0)  L T ER  AE  N S K ER  IH  P SH  (AH  ,0)  N

LINE 

(-,0)  L AA  IH  N 

LIQUID 

(-.0)  L IH  K W IH  D 

LIST 

(-.0)  L IH  S T 



LL 

(-.0)  EH  L EH  L 

LOAD 

(-,0)  L OW  D

LOCATE

(-,0)  L OW  K EH  (IH,AX)  T

LONGER

(-,0)  L AO  N G ER

LONGEST

(-,0)  L AO  N G IH  S T

LOW 

(-.0)  L OW 

LOWER 

(-.0)  L OW  (W  ,0)  ER 

LOWEST 

(-.0)  L OW  (W  ,0)  IH  S T 

MAKE 

(-,0)  M EH  (IH,AX)  K

MALE 

(-.0)  M EH  (IH,AX)  L 

MANY 

(-.0)  M EH  N lY 

MARK 

(-,0)  M AA  (ER  ,0)  K

MARKER

(-,0)  M AA  (ER  ,0)  K ER

MATCHES 

(-.0)  M AE  (-.0)  T SH  AX  S 

MAXIMA 

(-.0)  M AE  K S (IH  ,0)  M AH 

MAXIMUM 

(-.0)  M AE  K S (AX  ,0)  M AH  M 

MCCANDLESS 

(-.0)  M IH  K AE  N D L IH  S 

MEAN-VALUES 

(-.0)  M lY  N V (AE  ,0)  L Y UW  (S  ,0) 

MID 

(-.0)  M IH  D 

MILLISECONDS 

(-.0)  M AH  L (AH  ,0)  S EH  K (AX  ,0)  N (T  ,0)  S 

MINIMA 

(-.0)  M IH  N (IH  ,0)  M AH 

MINIMUM 

(-.0)  M IH  N (AX  ,0)  M AX  M 

MODE 

(-,0)  M OW  D

MOMENT 

(-.0)  M OW  M EH  N T 

MOST 

(-.0)  M OW  S T 

MOVE 

(-.0)  M UW  V 

MY 

(-.0)  M AA  IH 

NAME 

(-.0)  N EH  (IH.AX)  M 

NAMES 

(-,0)  N EH  (IH,AX)  M S

NASAL

(-,0)  N EH  (IH,AX)  S (EH,0)  L

NEXT 

(-,0)  N EH  K S T 

NINE 

(-,0)  N AA  IH  N 

NINETEEN 

(-.0)  N AA  IH  N T lY  N 

NINETY 

(-.0)  N AA  IH  N D lY 

NINTH 

(-.0)  N AA  IH  N F 

NORMALIZE 

(-,0)  N OW  (ER  ,0)  M (EH,0)  L AA  IH  S

NORMALIZED 

(-.0)  N OW  (ER  ,0)  M (EH,0)  L AA  IH  S D 

NOW 

(-,0)  N AA  UH

NUMBER 

(-.0)  N AH  M B ER 

OCCURANCE 

(-.0)  (AH  ,0)  K ER  EH  N S (IH  ,0) 

OF 

(-,0)  (AH  ,0)  V

ON 

(-.0)  (AA  ,0) 

ONE 

(-.0)  W AH  N 

ONLY 

(-,0)  OW  N L IY

ONTO 

(-.0)  AA  lY  N T Y (UW  ,0) 

OTHER 

(-,0)  AH  DH  ER

OUT 

(-.0)  AA  UH  T 

OUTPUT 

(-.0)  AA  UH  (T  ,0)  P UH  T 

OVER 

(-.0)  OW  V ER 

OWNER’S-NAME 

(-.0)  OW  N ER  S N EH  (IH,AX)  M 

OWNS  (-,0)  OW  N S 

PARSE  (-.0)  P AA  (ER  ,0)  S 

PARSE-TREE  (-,0)  P AA  (ER  ,0)  S T ER  IY

PART  (-.0)  P AA  (ER  ,0)  T 

PERCENTAGES  (-.0)  P ER  S EH  N T EH  (IH,AX)  (-.0)  D SH  S 

PHONEME  (-.0)  F OW  N lY  M 

PHONEMIC  (-.0)  F OW  N lY  M IH  K 

PHONEMIC-TRANSCRIPTION

(-,0)  F OW  N IY  M IH  K T ER  AE  N S K ER  IH  P SH  (AH  ,0)  N

PHRASE

(-,0)  F ER  EH  (IH,AX)  S (AX  ,0)

PICKOUT

(-,0)  P IH  K AA  UH  T

PITCH

(-,0)  P IH  (-,0)  T SH

PLEASE

(-,0)  P L IY  S

PLOSIVE

(-,0)  P L OW  S IH  V

PLOT

(-,0)  P L AA  T

POINT

(-,0)  P AO  IH  N T

POINTER

(-,0)  P AO  IH  N T ER

POSITION

(-,0)  P AH  S IH  SH  (AH  ,0)  N

PRECEEDED-BY

(-,0)  P ER  IY  S IY  D AX  D AA  IH

PRECEEDING

(-,0)  P ER  IY  S IY  D IH  NX

PREDICTIVE-CODING

(-,0)  P ER  IY  D IH  (K  ,0)  T IH  (V  ,0)  K OW  D AX  NX

PREVIOUS 

(-,0)  P ER  IY  V Y AH  S

PRIMARY

(-,0)  P ER  AA  IH  M (AX  ,0)  (ER  ,0)  IY

PRINT 

(-.0)  P ER  IH  N T 

PROBABILITIES 

(-.0)  P ER  AA  B (AH  ,0)  B IH  L IH  D lY  S 

PROBLEM

(-,0)  P ER  AA  B L AH  M

PROCEED 

(-.0)  P ER  OW  S lY  D 

PUT 

(-.0)  P UH  T 

PUT-UP

(-,0)  P UH  T AH  P

PUTTING

(-,0)  P UH  D IH  NX

RANGE 

(-.0)  ER  EH  (IH,AX)  N (-.0)  D SH 

RANGE-OF

(-,0)  ER  EH  (IH,AX)  N (-,0)  D SH  (AH  ,0)  V

READ

(-,0)  ER  IY  D

READ-IN

(-,0)  ER  IY  D IH  N

RECALCULATE 

(-.0)  ER  lY  K AE  L K (Y  ,0)  (AX  ,0)  L EH  (IH.AX)  T 

RECALCULATED 

(-.0)  ER  lY  K AE  L K (Y  ,0)  (AX  ,0)  L EH  (IH,AX)  T AX  D 

RECOMPUTE 

(-,0)  ER  IY  K (AX  ,0)  M P (Y  ,0)  UW  T

RECOMPUTED 

(-.0)  ER  lY  K (AX  ,0)  M P (Y  ,0)  UW  T AX  D 

REDISPLAY 

(-.0)  ER  lY  D (IH  ,0)  S P L EH  (IH.AX) 

REDO 

(-.0)  ER  lY  D UW 

REDUCE 

(-.0)  ER  lY  D UW  S 

REDUCING

(-.0)  ER  lY  D (Y  ,0)  UW  S IH  NX 

REFRESH 

(-.0)  ER  lY  F ER  EH  SH 

REINITIALIZE 

(-.0)  ER  lY  IH  N IH  SH  (AX  ,0)  AA  IH  S 

REMOVE 

(-,0)  ER  IY  M UW  V

RESET 

(-,0)  ER  lY  S EH  T 

RESULT 

(-.0)  ER  lY  S AH  L T 

RETRIEVE 

(-.0)  ER  lY  T (S  ,0)  ER  lY  V 

RETURN-TO 

(-,0)  ER  IY  T ER  N T Y (UW  ,0)



REWIND

(-,0)  ER  IY  W AA  IH  N D

RIGHT

(-,0)  ER  AA  IH  T

RW

(-,0)  AH  ER  D AH  B (EH,0)  L Y UW

SAMPLE

(-,0)  S AE  M P (EH,0)  L

SAVE

(-,0)  S EH  (IH,AX)  V

SCAN-CONVERTOR

(-,0)  S K AE  N K (AX  ,0)  N V ER  D ER

SCOPE

(-,0)  S K OW  P

SCRATCH

(-,0)  S K ER  AE  (-,0)  T SH

SCREEN

(-,0)  S K ER  IY  N

SDC

(-,0)  EH  S D IY  S IY

SEARCH

(-,0)  S ER  (-,0)  T SH

SECOND

(-,0)  S EH  K AX  N D

SECONDARY

(-,0)  S EH  K (AX  ,0)  N D EH  (ER  ,0)  (IY  ,0)

SECONDS

(-,0)  S EH  K AX  N S

SECTION

(-,0)  S EH  (IH,AX)  K SH  (AH  ,0)  N

SEGMENT

(-,0)  S EH  G M EH  N T

SEGMENTATION

(-,0)  S EH  G M (EH  ,0)  EH  (IH,AX)  N T EH  (IH,AX)  SH  AX  N

SELECT

(-,0)  S (AH  ,0)  EH  K T

SEMIVOWEL

(-,0)  S EH  M IY  V AA  UH  L

SENTENCE

(-,0)  S EH  N (T  AX  N ,0)  S

SEQUENCE

(-,0)  S IY  K W EH  N S

SET

(-,0)  S EH  T

SET-TO

(-,0)  S EH  T UW

SETTING

(-,0)  S EH  D IH  NX

SEVEN

(-,0)  S EH  V EH  N

SEVENTEEN

(-,0)  S EH  V (EH  ,0)  N T IY  N

SEVENTH

(-,0)  S EH  V AX  N F

SEVENTY

(-,0)  S EH  V AX  N D IY

SEX

(-,0)  S EH  K S

SHIFT

(-,0)  S IH  F T

SHIFT-TO

(-,0)  SH  IH  F T Y (UW  ,0)

SHORTER

(-,0)  SH  AO  (ER  ,0)  T ER

SHORTEST

(-,0)  SH  AO  (ER  ,0)  T AX  S T

SHOW-ME

(-,0)  SH  OW  M IY

SILENCE

(-,0)  S AA  IH  L (IH  ,0)  N S

SITE

(-,0)  S AA  IH  T

SIX

(-,0)  S IH  K S

SIXTEEN

(-,0)  S IH  K S T IY  N

SIXTH

(-,0)  S IH  K S F

SIXTY

(-,0)  S IH  K S D IY

SKIP

(-,0)  S K IH  P (S  ,0)

SKIP-OVER

(-,0)  S K IH  P OW  V AH

SKIP-TO

(-,0)  S K IH  T Y (UW  ,0)

SLICES

(-,0)  S L AA  IH  S AH  S

SLIDE

(-,0)  S L AA  IH  D

SLOT

(-,0)  S L AA  T

SM

(-,0)  EH  S EH  M

SONORANT

(-,0)  S OW  L N ER  (AX  ,0)  N T

SPEAKER

(-,0)  S P IY  K ER


SPECTRA

(-,0)  S P EH  (K  ,0)  T ER

SPECTRAL

(-,0)  S P EH  (K  ,0)  T (ER  ,0)  (EH,0)  L

SPECTROGRAM

(-,0)  S P EH  (K  ,0)  T ER  G ER  AE  M

SPECTRUM

(-,0)  S P EH  (K  ,0)  T ER  AH  M

SPOKEN-BY

(-,0)  S P OW  K (AH  ,0)  N B AA  IH

SPREAD-OUT

(-,0)  S P ER  EH  D AA  UH  T

SRI

(-,0)  EH  S AH  ER  AA  IH

STARTING

(-,0)  S AA  (ER  ,0)  D IH  NX

STARTING-WITH

(-,0)  S AA  (ER  ,0)  D IH  NX  W IH  F

STATEMENT

(-,0)  S EH  (IH,AX)  (T  ,0)  M EH  N (T  ,0)

STOP

(-,0)  S AA  P

STORE

(-,0)  S AO

STRESS

(-,0)  S (ER  ,0)  EH  S

STRING

(-,0)  S (ER  ,0)  IH  NX

SWITCH-TO

(-,0)  S W IH  (-,0)  T SH  T Y UW

TAG

(-,0)  T AE  G

TAKE

(-,0)  T (S  ,0)  EH  (IH,AX)  K

TAPE

(-,0)  T (S  ,0)  EH  (IH,AX)  P

TAPE-UNIT

(-,0)  T (S  ,0)  EH  (IH,AX)  P IH  N IH  T

TEN

(-,0)  T EH  N

TENTH

(-,0)  T EH  N F

TERMINAL

(-,0)  T ER  M (IH  ,0)  N (EH,0)  L

TERTIARY

(-,0)  T ER  SH  (AX  ,0)  ER  IY

THAN

(-,0)  DH  AE  N

THAT

(-,0)  DH  AA  T

THE

(-,0)  DH  (AH  ,0)

THERE

(-,0)  DH  EH  ER

THESE

(-,0)  DH  IY  S

THIRD

(-,0)  F ER  D

THIRTEEN

(-,0)  F ER  T IY  N

THIRTY

(-,0)  F ER  D IY

THIS

(-,0)  DH  IH  S

THOSE

(-,0)  DH  OW  S

THREE

(-,0)  F ER  IY

THRESHOLD

(-,0)  F ER  EH  SH  (EH,0)  L D

THROW-AWAY

(-,0)  F ER  OW  (W  ,0)  (AH  ,0)  (W  ,0)  EH  (IH,AX)

TIME

(-,0)  T AA  IH  M

TO

(-,0)  T Y (UW  ,0)

TOTAL

(-,0)  T OW  D (EH,0)  L

TRANSITION

(-,0)  T ER  AE  N S IH  SH  (AH  ,0)  N

TRY-TO-FIND

(-,0)  T ER  AA  IH  T AH  F AA  IH  N D

TWELVE

(-,0)  T W EH  L V

TWENTY

(-,0)  T W EH  N T IY

TWO

(-,0)  T Y UW

TYPE

(-,0)  T AA  IH  P

UNDER

(-,0)  AH  N D ER

UNIT

(-,0)  Y UW  N IH  T

UNVOICED

(-,0)  AH  N V AO  IH  S T

UTTERANCE

(-,0)  AH  D ER  (EH  ,0)  N S

VALUE

(-,0)  V (AE  ,0)  L Y UW




VECTOR 

(-.0)  V EH  (K  ,0)  T ER 

VOICED 

(-.0)  V AO  IH  S T 

VOICELESS 

(-.0)  V AO  IH  S L IH  S 

VOICING 

(-.0)  V AO  IH  S IH  NX 

VOWEL 

(-.0)  V AA  UH  L 

WAS 

(-.0)  W AH  S 

WAVEFORM 

(-,0)  W EH  (IH,AX)  V F ER  M

WEINSTEIN 

(-.0)  W AA  IH  N S T AA  IH  N 

WELL 

(-.0)  W EH  L 

WHAT 

(-.0)  HH  W AA  T 

WHERE 

(-.0)  HH  W EH  ER 

WHICH 

(-.0)  HH  W IH  (-,0)  T SH 

WHO 

(-,0)  HH  UW 

WIDTH 

(-.0)  W IH  D F 

WIESEN 

(-,0)  W IY  S AX  N

WILL 

(-,0)  W IH  L 

WITH 

(-.0)  W IH  F 

WORD 

(-.0)  W ER  D 

WRITE 

(-.0)  ER  AA  IH  T 

XEROX 

(-.0)  S IH  ER  AA  K S 

ZEROCROSSING-DENSITY

(-,0)  S IH  ER  OW  K ER  AA  S IH  NX  D EH  N S IH  D IY

ZEROCROSSINGS

(-,0)  S IH  ER  OW  K ER  AA  S IH  NX  S
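
The dictionary notation can be read mechanically. The sketch below is illustrative only and rests on an assumption about the notation that the listing itself does not state: a parenthesized group lists alternatives for one position, with `0` standing for "nothing", so `(AH  ,0)` is an optional AH, `(IH,AX)` is either IH or AX, and the leading `(-,0)` is an optional `-` symbol. The function names `parse_entry` and `expand_entry` are hypothetical. Under that reading, each entry expands into a finite set of phone strings.

```python
import re
from itertools import product

# A token is either a parenthesized group of alternatives or a bare phone.
TOKEN = re.compile(r"\(([^)]*)\)|(\S+)")


def parse_entry(text):
    """Split a dictionary entry into slots, one list of alternatives each.

    Assumption: inside a group, "0" denotes the empty alternative.
    """
    slots = []
    for group, phone in TOKEN.findall(text):
        if phone:
            slots.append([phone])
        else:
            alternatives = [alt.strip() for alt in group.split(",")]
            slots.append(["" if alt == "0" else alt for alt in alternatives])
    return slots


def expand_entry(text):
    """Return every phone string the entry allows, as space-joined strings."""
    variants = []
    for combo in product(*parse_entry(text)):
        variants.append(" ".join(p for p in combo if p))
    return variants


if __name__ == "__main__":
    # BATCH, as listed in the dictionary above.
    for variant in expand_entry("(-,0)  B AE  (-,0)  T SH"):
        print(variant)
```

Expanding every entry this way yields the full set of pronunciations a recognizer would have to distinguish, which is one way to compare the acoustic ambiguity of two vocabularies.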


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE          READ INSTRUCTIONS BEFORE COMPLETING FORM

1.  REPORT NUMBER
2.  GOVT ACCESSION NO.
3.  RECIPIENT'S CATALOG NUMBER
4.  TITLE (and Subtitle):  ANALYSIS OF LANGUAGES FOR MAN-MACHINE VOICE COMMUNICATION
5.  TYPE OF REPORT & PERIOD COVERED:  Interim
6.  PERFORMING ORG. REPORT NUMBER
7.  AUTHOR(s):  Robert Gary Goodman
8.  CONTRACT OR GRANT NUMBER(s):  F44620-73-C-0074
9.  PERFORMING ORGANIZATION NAME AND ADDRESS:  Carnegie-Mellon University, Computer Science Dept., Pittsburgh, PA 15213
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:  61101D, AO 2466
11. CONTROLLING OFFICE NAME AND ADDRESS:  Defense Advanced Research Projects Agency, 1400 Wilson Blvd, 22209
12. REPORT DATE:  1976
13. NUMBER OF PAGES:  169
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):  Air Force Office of Scientific Research (NN), Bolling AFB, DC 20332
15. SECURITY CLASS. (of this report):  UNCLASSIFIED
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE
16. DISTRIBUTION STATEMENT (of this Report):  Approved for public release; distribution unlimited.
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)
18. SUPPLEMENTARY NOTES
19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
20. ABSTRACT (Continue on reverse side if necessary and identify by block number):  see attached

DD FORM 1473, 1 JAN 73     EDITION OF 1 NOV 65 IS OBSOLETE

UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)